Message ID | 343c844bbc5081d13ee4c9aa27ff3118f607e1cc.1539092112.git.christophe.leroy@c-s.fr (mailing list archive) |
---|---|
State | New, archived |
Series | ban the use of _PAGE_XXX flags outside platform specific code |
Christophe Leroy <christophe.leroy@c-s.fr> writes:
> Get rid of platform specific _PAGE_XXXX in powerpc common code and
> use helpers instead.
>
> mm/dump_linuxpagetables.c will be handled separately
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> arch/powerpc/include/asm/book3s/32/pgtable.h | 9 +++------
> arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++----
> arch/powerpc/include/asm/nohash/pgtable.h | 3 +--
> arch/powerpc/mm/pgtable.c | 21 +++++++--------------
> arch/powerpc/mm/pgtable_32.c | 15 ++++++++-------
> arch/powerpc/mm/pgtable_64.c | 14 +++++++-------
> arch/powerpc/xmon/xmon.c | 12 +++++++-----
> 7 files changed, 41 insertions(+), 45 deletions(-)

So turns out this patch *also* breaks my p5020ds :)

Even with patch 4 merged, see next.

It's the same crash:

pcieport 2000:00:00.0: AER enabled with IRQ 480
Unable to handle kernel paging request for data at address 0x8000080080080000
Faulting instruction address: 0xc0000000000192cc
Oops: Kernel access of bad area, sig: 11 [#1]
BE SMP NR_CPUS=24 CoreNet Generic
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1
NIP: c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000
REGS: c0000000f31bb400 TRAP: 0300 Not tainted (4.19.0-rc3-gcc7x-g98c847323b3a)
MSR: 0000000080029000 <CE,EE,ME> CR: 24000224 XER: 00000000
DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0
GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000
GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000
GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710
GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8
GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98
NIP [c0000000000192cc] ._memset_io+0x6c/0x9c
LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928
Call Trace:
[c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable)
[c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4
[c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c
[c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c
[c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118
[c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c
[c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc
[c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180
[c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c
[c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38
[c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8
[c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494
[c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148
[c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70
Instruction dump:
4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001
550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003

Comparing a working vs broken kernel, it seems to boil down to the fact
that we're filtering out more PTE bits now that we use pte_pgprot() in
ioremap_prot().

With the old code we get:
ioremap_prot: addr 0xff800000 flags 0x241215
ioremap_prot: addr 0xff800000 flags 0x241215
map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215

And now we get:
ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215
ioremap_prot: addr 0xff800000 pte 0x241215
ioremap_prot: addr 0xff800000 prot 0x241014
map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014

So we're losing 0x201, which for nohash book3e is:

#define _PAGE_PRESENT 0x000001 /* software: pte contains a translation */
#define _PAGE_PSIZE_4K 0x000200

I haven't worked out if it's one or both of those that matter.

The question is what's the right way to fix it? Should pte_pgprot() not
be filtering those bits out on book3e?

cheers
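The arithmetic can be double-checked in isolation. A minimal user-space C sketch, using only the flag values and trace output quoted above (illustrative, not kernel code):

#include <stdio.h>

/* bit values quoted above for nohash book3e */
#define _PAGE_PRESENT   0x000001UL  /* software: pte contains a translation */
#define _PAGE_PSIZE_4K  0x000200UL

int main(void)
{
	unsigned long old_prot = 0x241215UL;	/* flags the old ioremap_prot() passed on */
	unsigned long new_prot = 0x241014UL;	/* prot left after pte_pgprot() filtering */
	unsigned long lost = old_prot & ~new_prot;

	printf("lost bits: 0x%lx\n", lost);				/* 0x201 */
	printf("_PAGE_PRESENT lost:  %d\n", !!(lost & _PAGE_PRESENT));	/* 1 */
	printf("_PAGE_PSIZE_4K lost: %d\n", !!(lost & _PAGE_PSIZE_4K));	/* 1 */
	return 0;
}

So both bits are dropped; as Christophe notes below, at least the missing _PAGE_PRESENT is enough to be a problem.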
On 10/17/2018 12:59 AM, Michael Ellerman wrote: > Christophe Leroy <christophe.leroy@c-s.fr> writes: > >> Get rid of platform specific _PAGE_XXXX in powerpc common code and >> use helpers instead. >> >> mm/dump_linuxpagetables.c will be handled separately >> >> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> >> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> >> --- >> arch/powerpc/include/asm/book3s/32/pgtable.h | 9 +++------ >> arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++---- >> arch/powerpc/include/asm/nohash/pgtable.h | 3 +-- >> arch/powerpc/mm/pgtable.c | 21 +++++++-------------- >> arch/powerpc/mm/pgtable_32.c | 15 ++++++++------- >> arch/powerpc/mm/pgtable_64.c | 14 +++++++------- >> arch/powerpc/xmon/xmon.c | 12 +++++++----- >> 7 files changed, 41 insertions(+), 45 deletions(-) > > So turns out this patch *also* breaks my p5020ds :) > > Even with patch 4 merged, see next. > > It's the same crash: > > pcieport 2000:00:00.0: AER enabled with IRQ 480 > Unable to handle kernel paging request for data at address 0x8000080080080000 > Faulting instruction address: 0xc0000000000192cc > Oops: Kernel access of bad area, sig: 11 [#1] > BE SMP NR_CPUS=24 CoreNet Generic > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1 > NIP: c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000 > REGS: c0000000f31bb400 TRAP: 0300 Not tainted (4.19.0-rc3-gcc7x-g98c847323b3a) > MSR: 0000000080029000 <CE,EE,ME> CR: 24000224 XER: 00000000 > DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0 > GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000 > GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000 > GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710 > GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000 > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8 > GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98 > NIP [c0000000000192cc] ._memset_io+0x6c/0x9c > LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928 > Call Trace: > [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable) > [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4 > [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c > [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c > [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118 > [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c > [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc > [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180 > [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c > [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38 > [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8 > [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494 > [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148 > [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70 > Instruction dump: > 4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001 > 550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003 > > > Comparing a working vs broken kernel, it seems to boil down to the fact > 
that we're filtering out more PTE bits now that we use pte_pgprot() in > ioremap_prot(). > > With the old code we get: > ioremap_prot: addr 0xff800000 flags 0x241215 > ioremap_prot: addr 0xff800000 flags 0x241215 > map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215 > > > And now we get: > ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215 > ioremap_prot: addr 0xff800000 pte 0x241215 > ioremap_prot: addr 0xff800000 prot 0x241014 > map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014 > > So we're losing 0x201, which for nohash book3e is: > > #define _PAGE_PRESENT 0x000001 /* software: pte contains a translation */ > #define _PAGE_PSIZE_4K 0x000200 > > > I haven't worked out if it's one or both of those that matter. At least missing _PAGE_PRESENT is an issue I believe. > > The question is what's the right way to fix it? Should pte_pgprot() not > be filtering those bits out on book3e? I think we should not use pte_pggrot() for that then. What about the below fix ? Christophe From: Christophe Leroy <christophe.leroy@c-s.fr> Date: Wed, 17 Oct 2018 05:56:25 +0000 Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot() pte_pgprot() filters out some required flags like _PAGE_PRESENT. This patch replaces pte_pgprot() by __pgprot(pte_val()) in ioremap_prot() Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> --- arch/powerpc/mm/pgtable_32.c | 3 ++- arch/powerpc/mm/pgtable_64.c | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 5877f5aa8f5d..a606e2f4937b 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) pte = pte_exprotect(pte); pte = pte_mkprivileged(pte); - return __ioremap_caller(addr, size, pte_pgprot(pte), __builtin_return_address(0)); + return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), + __builtin_return_address(0)); } EXPORT_SYMBOL(ioremap_prot); diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index fb1375c07e8c..836bf436cabb 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -245,8 +245,8 @@ void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size, pte = pte_mkprivileged(pte); if (ppc_md.ioremap) - return ppc_md.ioremap(addr, size, pte_pgprot(pte), caller); - return __ioremap_caller(addr, size, pte_pgprot(pte), caller); + return ppc_md.ioremap(addr, size, __pgprot(pte_val(pte)), caller); + return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), caller); }
Christophe Leroy <christophe.leroy@c-s.fr> writes: > On 10/17/2018 12:59 AM, Michael Ellerman wrote: >> Christophe Leroy <christophe.leroy@c-s.fr> writes: >> >>> Get rid of platform specific _PAGE_XXXX in powerpc common code and >>> use helpers instead. >>> >>> mm/dump_linuxpagetables.c will be handled separately >>> >>> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> >>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> >>> --- >>> arch/powerpc/include/asm/book3s/32/pgtable.h | 9 +++------ >>> arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++---- >>> arch/powerpc/include/asm/nohash/pgtable.h | 3 +-- >>> arch/powerpc/mm/pgtable.c | 21 +++++++-------------- >>> arch/powerpc/mm/pgtable_32.c | 15 ++++++++------- >>> arch/powerpc/mm/pgtable_64.c | 14 +++++++------- >>> arch/powerpc/xmon/xmon.c | 12 +++++++----- >>> 7 files changed, 41 insertions(+), 45 deletions(-) >> >> So turns out this patch *also* breaks my p5020ds :) >> >> Even with patch 4 merged, see next. >> >> It's the same crash: >> >> pcieport 2000:00:00.0: AER enabled with IRQ 480 >> Unable to handle kernel paging request for data at address 0x8000080080080000 >> Faulting instruction address: 0xc0000000000192cc >> Oops: Kernel access of bad area, sig: 11 [#1] >> BE SMP NR_CPUS=24 CoreNet Generic >> Modules linked in: >> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1 >> NIP: c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000 >> REGS: c0000000f31bb400 TRAP: 0300 Not tainted (4.19.0-rc3-gcc7x-g98c847323b3a) >> MSR: 0000000080029000 <CE,EE,ME> CR: 24000224 XER: 00000000 >> DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0 >> GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000 >> GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000 >> GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710 >> GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000 >> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >> GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8 >> GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98 >> NIP [c0000000000192cc] ._memset_io+0x6c/0x9c >> LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928 >> Call Trace: >> [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable) >> [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4 >> [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c >> [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c >> [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118 >> [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c >> [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc >> [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180 >> [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c >> [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38 >> [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8 >> [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494 >> [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148 >> [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70 >> Instruction dump: >> 4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001 >> 550a801e 7d2903a6 7d4a4378 794a0020 
<91430000> 38630004 4200fff8 70a50003 >> >> >> Comparing a working vs broken kernel, it seems to boil down to the fact >> that we're filtering out more PTE bits now that we use pte_pgprot() in >> ioremap_prot(). >> >> With the old code we get: >> ioremap_prot: addr 0xff800000 flags 0x241215 >> ioremap_prot: addr 0xff800000 flags 0x241215 >> map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215 >> >> >> And now we get: >> ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215 >> ioremap_prot: addr 0xff800000 pte 0x241215 >> ioremap_prot: addr 0xff800000 prot 0x241014 >> map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014 >> >> So we're losing 0x201, which for nohash book3e is: >> >> #define _PAGE_PRESENT 0x000001 /* software: pte contains a translation */ >> #define _PAGE_PSIZE_4K 0x000200 >> >> >> I haven't worked out if it's one or both of those that matter. > > At least missing _PAGE_PRESENT is an issue I believe. >> >> The question is what's the right way to fix it? Should pte_pgprot() not >> be filtering those bits out on book3e? > > I think we should not use pte_pggrot() for that then. What about the > below fix ? > > Christophe > > From: Christophe Leroy <christophe.leroy@c-s.fr> > Date: Wed, 17 Oct 2018 05:56:25 +0000 > Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot() > > pte_pgprot() filters out some required flags like _PAGE_PRESENT. > > This patch replaces pte_pgprot() by __pgprot(pte_val()) > in ioremap_prot() > > Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code") > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > --- > arch/powerpc/mm/pgtable_32.c | 3 ++- > arch/powerpc/mm/pgtable_64.c | 4 ++-- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c > index 5877f5aa8f5d..a606e2f4937b 100644 > --- a/arch/powerpc/mm/pgtable_32.c > +++ b/arch/powerpc/mm/pgtable_32.c > @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, > unsigned long flags) > pte = pte_exprotect(pte); > pte = pte_mkprivileged(pte); > > - return __ioremap_caller(addr, size, pte_pgprot(pte), > __builtin_return_address(0)); > + return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), > + __builtin_return_address(0)); That means we pass the pfn bits also to __ioremap_caller right? How about From b4d5e0f24f8482375b2dd86afaced26ebf716600 Mon Sep 17 00:00:00 2001 From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Date: Wed, 17 Oct 2018 14:07:50 +0530 Subject: [PATCH] powerpc/mm: Make pte_pgprot return all pte bits Other archs do the same and instead of adding required pte bits (which got masked out) in __ioremap_at, make sure we filter only pfn bits out. 
Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> --- arch/powerpc/include/asm/book3s/32/pgtable.h | 6 ------ arch/powerpc/include/asm/book3s/64/pgtable.h | 8 -------- arch/powerpc/include/asm/nohash/32/pte-40x.h | 5 ----- arch/powerpc/include/asm/nohash/32/pte-44x.h | 5 ----- arch/powerpc/include/asm/nohash/32/pte-8xx.h | 5 ----- arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h | 5 ----- arch/powerpc/include/asm/nohash/pgtable.h | 1 - arch/powerpc/include/asm/nohash/pte-book3e.h | 5 ----- arch/powerpc/include/asm/pgtable.h | 10 ++++++++++ 9 files changed, 10 insertions(+), 40 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 0fbd4c642b51..e61dd3ae5bc0 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -48,11 +48,6 @@ static inline bool pte_user(pte_t pte) #define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HASHPTE | _PAGE_DIRTY | \ _PAGE_ACCESSED | _PAGE_SPECIAL) -/* Mask of bits returned by pte_pgprot() */ -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ - _PAGE_RW | _PAGE_DIRTY) - /* * We define 2 sets of base prot bits, one for basic pages (ie, * cacheable kernel and user pages) and one for non cacheable @@ -396,7 +391,6 @@ static inline int pte_young(pte_t pte) { return !!(pte_val(pte) & _PAGE_ACCESSE static inline int pte_special(pte_t pte) { return !!(pte_val(pte) & _PAGE_SPECIAL); } static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; } static inline bool pte_exec(pte_t pte) { return true; } -static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PAGE_PROT_BITS); } static inline int pte_present(pte_t pte) { diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index c34a161dc651..cb5dd4078d42 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -128,13 +128,6 @@ #define H_PTE_PKEY (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \ H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4) -/* - * Mask of bits returned by pte_pgprot() - */ -#define PAGE_PROT_BITS (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \ - H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \ - _PAGE_READ | _PAGE_WRITE | _PAGE_DIRTY | _PAGE_EXEC | \ - _PAGE_SOFT_DIRTY | H_PTE_PKEY) /* * We define 2 sets of base prot bits, one for basic pages (ie, * cacheable kernel and user pages) and one for non cacheable @@ -496,7 +489,6 @@ static inline bool pte_exec(pte_t pte) return !!(pte_raw(pte) & cpu_to_be64(_PAGE_EXEC)); } -static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PAGE_PROT_BITS); } #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline bool pte_soft_dirty(pte_t pte) diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h index 7a8b3c94592f..661f4599f2fc 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-40x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h @@ -73,11 +73,6 @@ /* Until my rework is finished, 40x still needs atomic PTE updates */ #define PTE_ATOMIC_UPDATES 1 -/* Mask of bits returned by pte_pgprot() */ -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_NO_CACHE | \ - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ - _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC) - #define 
_PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED) #define _PAGE_BASE (_PAGE_BASE_NC) diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h index 8d6b268a986f..78bc304f750e 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-44x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h @@ -93,11 +93,6 @@ #define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW) #define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC) -/* Mask of bits returned by pte_pgprot() */ -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ - _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC) - /* TODO: Add large page lowmem mapping support */ #define _PMD_PRESENT 0 #define _PMD_PRESENT_MASK (PAGE_MASK) diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h index 1c57efac089d..6bfe041ef59d 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h @@ -55,11 +55,6 @@ #define _PAGE_KERNEL_RW (_PAGE_SH | _PAGE_DIRTY) #define _PAGE_KERNEL_RWX (_PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC) -/* Mask of bits returned by pte_pgprot() */ -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_NO_CACHE | \ - _PAGE_ACCESSED | _PAGE_RO | _PAGE_NA | \ - _PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC) - #define _PMD_PRESENT 0x0001 #define _PMD_PRESENT_MASK _PMD_PRESENT #define _PMD_BAD 0x0fd0 diff --git a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h index 1ecf60fe0909..0fc1bd42bb3e 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h +++ b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h @@ -39,11 +39,6 @@ /* No page size encoding in the linux PTE */ #define _PAGE_PSIZE 0 -/* Mask of bits returned by pte_pgprot() */ -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ - _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC) - #define _PMD_PRESENT 0 #define _PMD_PRESENT_MASK (PAGE_MASK) #define _PMD_BAD (~PAGE_MASK) diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 04e9f0922ad4..70ff23974b59 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -52,7 +52,6 @@ static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) static inline bool pte_hashpte(pte_t pte) { return false; } static inline bool pte_ci(pte_t pte) { return pte_val(pte) & _PAGE_NO_CACHE; } static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; } -static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PAGE_PROT_BITS); } #ifdef CONFIG_NUMA_BALANCING /* diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h index 58eef8cb569d..f95ab6eaf441 100644 --- a/arch/powerpc/include/asm/nohash/pte-book3e.h +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h @@ -82,11 +82,6 @@ #define _PTE_NONE_MASK 0 #endif -/* Mask of bits returned by pte_pgprot() */ -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ - _PAGE_PRIVILEGED | _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC) - /* * We define 2 sets of base prot bits, one for basic pages (ie, * cacheable kernel and user pages) and one for non cacheable diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 
fb4b85bba110..9679b7519a35 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -46,6 +46,16 @@ struct mm_struct; /* Keep these as a macros to avoid include dependency mess */ #define pte_page(x) pfn_to_page(pte_pfn(x)) #define mk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot)) +/* + * Select all bits except the pfn + */ +static inline pgprot_t pte_pgprot(pte_t pte) +{ + unsigned long pte_flags; + + pte_flags = pte_val(pte) & ~PTE_RPN_MASK; + return __pgprot(pte_flags); +} /* * ZERO_PAGE is a global shared page that is always zero: used
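Applied to the numbers from this thread, the pfn-masking pte_pgprot() keeps _PAGE_PRESENT because only the RPN field is filtered out. A small user-space sketch of the effect; PTE_RPN_MASK below is a stand-in chosen to match the example pte, not the real platform definition:

#include <stdio.h>

/* Stand-in for the platform PTE_RPN_MASK, assumed here to cover the
 * physical page number bits above bit 24 so that it matches the example
 * pte 0xff800241215 (pa 0xff800000, flags 0x241215). Illustrative only. */
#define PTE_RPN_MASK	0x000fffffff000000UL

/* sketch of the proposed generic helper: select all bits except the pfn */
static unsigned long pte_pgprot_flags(unsigned long pte)
{
	return pte & ~PTE_RPN_MASK;
}

int main(void)
{
	unsigned long pte = 0xff800241215UL;	/* from the map_kernel_page trace above */

	/* the old whitelist-based pte_pgprot() gave 0x241014 (no _PAGE_PRESENT);
	 * masking only the pfn keeps 0x241215, i.e. _PAGE_PRESENT survives */
	printf("prot = 0x%lx\n", pte_pgprot_flags(pte));
	return 0;
}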
Le 17/10/2018 à 11:39, Aneesh Kumar K.V a écrit : > Christophe Leroy <christophe.leroy@c-s.fr> writes: > >> On 10/17/2018 12:59 AM, Michael Ellerman wrote: >>> Christophe Leroy <christophe.leroy@c-s.fr> writes: >>> >>>> Get rid of platform specific _PAGE_XXXX in powerpc common code and >>>> use helpers instead. >>>> >>>> mm/dump_linuxpagetables.c will be handled separately >>>> >>>> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> >>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> >>>> --- >>>> arch/powerpc/include/asm/book3s/32/pgtable.h | 9 +++------ >>>> arch/powerpc/include/asm/nohash/32/pgtable.h | 12 ++++++++---- >>>> arch/powerpc/include/asm/nohash/pgtable.h | 3 +-- >>>> arch/powerpc/mm/pgtable.c | 21 +++++++-------------- >>>> arch/powerpc/mm/pgtable_32.c | 15 ++++++++------- >>>> arch/powerpc/mm/pgtable_64.c | 14 +++++++------- >>>> arch/powerpc/xmon/xmon.c | 12 +++++++----- >>>> 7 files changed, 41 insertions(+), 45 deletions(-) >>> >>> So turns out this patch *also* breaks my p5020ds :) >>> >>> Even with patch 4 merged, see next. >>> >>> It's the same crash: >>> >>> pcieport 2000:00:00.0: AER enabled with IRQ 480 >>> Unable to handle kernel paging request for data at address 0x8000080080080000 >>> Faulting instruction address: 0xc0000000000192cc >>> Oops: Kernel access of bad area, sig: 11 [#1] >>> BE SMP NR_CPUS=24 CoreNet Generic >>> Modules linked in: >>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc3-gcc7x-g98c847323b3a #1 >>> NIP: c0000000000192cc LR: c0000000005d0f9c CTR: 0000000000100000 >>> REGS: c0000000f31bb400 TRAP: 0300 Not tainted (4.19.0-rc3-gcc7x-g98c847323b3a) >>> MSR: 0000000080029000 <CE,EE,ME> CR: 24000224 XER: 00000000 >>> DEAR: 8000080080080000 ESR: 0000000000800000 IRQMASK: 0 >>> GPR00: c0000000005d0f84 c0000000f31bb688 c00000000117dc00 8000080080080000 >>> GPR04: 0000000000000000 0000000000400000 00000ffbff241010 c0000000f31b8000 >>> GPR08: 0000000000000000 0000000000100000 0000000000000000 c0000000012d4710 >>> GPR12: 0000000084000422 c0000000012ff000 c000000000002774 0000000000000000 >>> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >>> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 >>> GPR24: 0000000000000000 0000000000000000 8000080080080000 c0000000ffff89a8 >>> GPR28: c0000000f3576400 c0000000f3576410 0000000000400000 c0000000012ecc98 >>> NIP [c0000000000192cc] ._memset_io+0x6c/0x9c >>> LR [c0000000005d0f9c] .fsl_qman_probe+0x198/0x928 >>> Call Trace: >>> [c0000000f31bb688] [c0000000005d0f84] .fsl_qman_probe+0x180/0x928 (unreliable) >>> [c0000000f31bb728] [c0000000006432ec] .platform_drv_probe+0x60/0xb4 >>> [c0000000f31bb7a8] [c00000000064083c] .really_probe+0x294/0x35c >>> [c0000000f31bb848] [c000000000640d2c] .__driver_attach+0x148/0x14c >>> [c0000000f31bb8d8] [c00000000063d7dc] .bus_for_each_dev+0xb0/0x118 >>> [c0000000f31bb988] [c00000000063ff28] .driver_attach+0x34/0x4c >>> [c0000000f31bba08] [c00000000063f648] .bus_add_driver+0x174/0x2bc >>> [c0000000f31bbaa8] [c0000000006418bc] .driver_register+0x90/0x180 >>> [c0000000f31bbb28] [c000000000643270] .__platform_driver_register+0x60/0x7c >>> [c0000000f31bbba8] [c000000000ee2a70] .fsl_qman_driver_init+0x24/0x38 >>> [c0000000f31bbc18] [c0000000000023fc] .do_one_initcall+0x64/0x2b8 >>> [c0000000f31bbcf8] [c000000000e9f480] .kernel_init_freeable+0x3a8/0x494 >>> [c0000000f31bbda8] [c000000000002798] .kernel_init+0x24/0x148 >>> [c0000000f31bbe28] [c0000000000009e8] .ret_from_kernel_thread+0x58/0x70 >>> Instruction dump: 
>>> 4e800020 2ba50003 40dd003c 3925fffc 5488402e 7929f082 7d082378 39290001 >>> 550a801e 7d2903a6 7d4a4378 794a0020 <91430000> 38630004 4200fff8 70a50003 >>> >>> >>> Comparing a working vs broken kernel, it seems to boil down to the fact >>> that we're filtering out more PTE bits now that we use pte_pgprot() in >>> ioremap_prot(). >>> >>> With the old code we get: >>> ioremap_prot: addr 0xff800000 flags 0x241215 >>> ioremap_prot: addr 0xff800000 flags 0x241215 >>> map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241215 >>> >>> >>> And now we get: >>> ioremap_prot: addr 0xff800000 flags 0x241215 pte 0x241215 >>> ioremap_prot: addr 0xff800000 pte 0x241215 >>> ioremap_prot: addr 0xff800000 prot 0x241014 >>> map_kernel_page: ea 0x8000080080080000 pa 0xff800000 pte 0xff800241014 >>> >>> So we're losing 0x201, which for nohash book3e is: >>> >>> #define _PAGE_PRESENT 0x000001 /* software: pte contains a translation */ >>> #define _PAGE_PSIZE_4K 0x000200 >>> >>> >>> I haven't worked out if it's one or both of those that matter. >> >> At least missing _PAGE_PRESENT is an issue I believe. >>> >>> The question is what's the right way to fix it? Should pte_pgprot() not >>> be filtering those bits out on book3e? >> >> I think we should not use pte_pggrot() for that then. What about the >> below fix ? >> >> Christophe >> >> From: Christophe Leroy <christophe.leroy@c-s.fr> >> Date: Wed, 17 Oct 2018 05:56:25 +0000 >> Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot() >> >> pte_pgprot() filters out some required flags like _PAGE_PRESENT. >> >> This patch replaces pte_pgprot() by __pgprot(pte_val()) >> in ioremap_prot() >> >> Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code") >> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> >> --- >> arch/powerpc/mm/pgtable_32.c | 3 ++- >> arch/powerpc/mm/pgtable_64.c | 4 ++-- >> 2 files changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c >> index 5877f5aa8f5d..a606e2f4937b 100644 >> --- a/arch/powerpc/mm/pgtable_32.c >> +++ b/arch/powerpc/mm/pgtable_32.c >> @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, >> unsigned long flags) >> pte = pte_exprotect(pte); >> pte = pte_mkprivileged(pte); >> >> - return __ioremap_caller(addr, size, pte_pgprot(pte), >> __builtin_return_address(0)); >> + return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), >> + __builtin_return_address(0)); > > > That means we pass the pfn bits also to __ioremap_caller right? How about > > From b4d5e0f24f8482375b2dd86afaced26ebf716600 Mon Sep 17 00:00:00 2001 > From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> > Date: Wed, 17 Oct 2018 14:07:50 +0530 > Subject: [PATCH] powerpc/mm: Make pte_pgprot return all pte bits > > Other archs do the same and instead of adding required pte bits (which got > masked out) in __ioremap_at, make sure we filter only pfn bits out. > > Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code") > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Looks good for me. 
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> > --- > arch/powerpc/include/asm/book3s/32/pgtable.h | 6 ------ > arch/powerpc/include/asm/book3s/64/pgtable.h | 8 -------- > arch/powerpc/include/asm/nohash/32/pte-40x.h | 5 ----- > arch/powerpc/include/asm/nohash/32/pte-44x.h | 5 ----- > arch/powerpc/include/asm/nohash/32/pte-8xx.h | 5 ----- > arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h | 5 ----- > arch/powerpc/include/asm/nohash/pgtable.h | 1 - > arch/powerpc/include/asm/nohash/pte-book3e.h | 5 ----- > arch/powerpc/include/asm/pgtable.h | 10 ++++++++++ > 9 files changed, 10 insertions(+), 40 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h > index 0fbd4c642b51..e61dd3ae5bc0 100644 > --- a/arch/powerpc/include/asm/book3s/32/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h > @@ -48,11 +48,6 @@ static inline bool pte_user(pte_t pte) > #define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HASHPTE | _PAGE_DIRTY | \ > _PAGE_ACCESSED | _PAGE_SPECIAL) > > -/* Mask of bits returned by pte_pgprot() */ > -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ > - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ > - _PAGE_RW | _PAGE_DIRTY) > - > /* > * We define 2 sets of base prot bits, one for basic pages (ie, > * cacheable kernel and user pages) and one for non cacheable > @@ -396,7 +391,6 @@ static inline int pte_young(pte_t pte) { return !!(pte_val(pte) & _PAGE_ACCESSE > static inline int pte_special(pte_t pte) { return !!(pte_val(pte) & _PAGE_SPECIAL); } > static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; } > static inline bool pte_exec(pte_t pte) { return true; } > -static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PAGE_PROT_BITS); } > > static inline int pte_present(pte_t pte) > { > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h > index c34a161dc651..cb5dd4078d42 100644 > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h > @@ -128,13 +128,6 @@ > > #define H_PTE_PKEY (H_PTE_PKEY_BIT0 | H_PTE_PKEY_BIT1 | H_PTE_PKEY_BIT2 | \ > H_PTE_PKEY_BIT3 | H_PTE_PKEY_BIT4) > -/* > - * Mask of bits returned by pte_pgprot() > - */ > -#define PAGE_PROT_BITS (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \ > - H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \ > - _PAGE_READ | _PAGE_WRITE | _PAGE_DIRTY | _PAGE_EXEC | \ > - _PAGE_SOFT_DIRTY | H_PTE_PKEY) > /* > * We define 2 sets of base prot bits, one for basic pages (ie, > * cacheable kernel and user pages) and one for non cacheable > @@ -496,7 +489,6 @@ static inline bool pte_exec(pte_t pte) > return !!(pte_raw(pte) & cpu_to_be64(_PAGE_EXEC)); > } > > -static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PAGE_PROT_BITS); } > > #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY > static inline bool pte_soft_dirty(pte_t pte) > diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h > index 7a8b3c94592f..661f4599f2fc 100644 > --- a/arch/powerpc/include/asm/nohash/32/pte-40x.h > +++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h > @@ -73,11 +73,6 @@ > /* Until my rework is finished, 40x still needs atomic PTE updates */ > #define PTE_ATOMIC_UPDATES 1 > > -/* Mask of bits returned by pte_pgprot() */ > -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_NO_CACHE | \ > - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ 
> - _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC) > - > #define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED) > #define _PAGE_BASE (_PAGE_BASE_NC) > > diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h > index 8d6b268a986f..78bc304f750e 100644 > --- a/arch/powerpc/include/asm/nohash/32/pte-44x.h > +++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h > @@ -93,11 +93,6 @@ > #define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW) > #define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC) > > -/* Mask of bits returned by pte_pgprot() */ > -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ > - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ > - _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC) > - > /* TODO: Add large page lowmem mapping support */ > #define _PMD_PRESENT 0 > #define _PMD_PRESENT_MASK (PAGE_MASK) > diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h > index 1c57efac089d..6bfe041ef59d 100644 > --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h > +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h > @@ -55,11 +55,6 @@ > #define _PAGE_KERNEL_RW (_PAGE_SH | _PAGE_DIRTY) > #define _PAGE_KERNEL_RWX (_PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC) > > -/* Mask of bits returned by pte_pgprot() */ > -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_NO_CACHE | \ > - _PAGE_ACCESSED | _PAGE_RO | _PAGE_NA | \ > - _PAGE_SH | _PAGE_DIRTY | _PAGE_EXEC) > - > #define _PMD_PRESENT 0x0001 > #define _PMD_PRESENT_MASK _PMD_PRESENT > #define _PMD_BAD 0x0fd0 > diff --git a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h > index 1ecf60fe0909..0fc1bd42bb3e 100644 > --- a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h > +++ b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h > @@ -39,11 +39,6 @@ > /* No page size encoding in the linux PTE */ > #define _PAGE_PSIZE 0 > > -/* Mask of bits returned by pte_pgprot() */ > -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ > - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ > - _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC) > - > #define _PMD_PRESENT 0 > #define _PMD_PRESENT_MASK (PAGE_MASK) > #define _PMD_BAD (~PAGE_MASK) > diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h > index 04e9f0922ad4..70ff23974b59 100644 > --- a/arch/powerpc/include/asm/nohash/pgtable.h > +++ b/arch/powerpc/include/asm/nohash/pgtable.h > @@ -52,7 +52,6 @@ static inline int pte_none(pte_t pte) { return (pte_val(pte) & ~_PTE_NONE_MASK) > static inline bool pte_hashpte(pte_t pte) { return false; } > static inline bool pte_ci(pte_t pte) { return pte_val(pte) & _PAGE_NO_CACHE; } > static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; } > -static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PAGE_PROT_BITS); } > > #ifdef CONFIG_NUMA_BALANCING > /* > diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h > index 58eef8cb569d..f95ab6eaf441 100644 > --- a/arch/powerpc/include/asm/nohash/pte-book3e.h > +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h > @@ -82,11 +82,6 @@ > #define _PTE_NONE_MASK 0 > #endif > > -/* Mask of bits returned by pte_pgprot() */ > -#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \ > - _PAGE_WRITETHRU | _PAGE_USER | _PAGE_ACCESSED | \ > - _PAGE_PRIVILEGED | _PAGE_RW | _PAGE_DIRTY | _PAGE_EXEC) > - > /* > * We 
define 2 sets of base prot bits, one for basic pages (ie, > * cacheable kernel and user pages) and one for non cacheable > diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h > index fb4b85bba110..9679b7519a35 100644 > --- a/arch/powerpc/include/asm/pgtable.h > +++ b/arch/powerpc/include/asm/pgtable.h > @@ -46,6 +46,16 @@ struct mm_struct; > /* Keep these as a macros to avoid include dependency mess */ > #define pte_page(x) pfn_to_page(pte_pfn(x)) > #define mk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot)) > +/* > + * Select all bits except the pfn > + */ > +static inline pgprot_t pte_pgprot(pte_t pte) > +{ > + unsigned long pte_flags; > + > + pte_flags = pte_val(pte) & ~PTE_RPN_MASK; > + return __pgprot(pte_flags); > +} > > /* > * ZERO_PAGE is a global shared page that is always zero: used >
Christophe Leroy <christophe.leroy@c-s.fr> writes: > On 10/17/2018 12:59 AM, Michael Ellerman wrote: ... >> The question is what's the right way to fix it? Should pte_pgprot() not >> be filtering those bits out on book3e? > > I think we should not use pte_pggrot() for that then. What about the > below fix ? Thanks, that almost works. pte_mkprivileged() also needs to not strip _PAGE_BAP_SR. But there's also a use of pte_pgprot() in mm/memory.c, and I think that is also broken now that we don't add PAGE_KERNEL back in. Aneesh is going to do a patch to make pte_pgprot() only mask the PFN which is what other arches do. cheers > From: Christophe Leroy <christophe.leroy@c-s.fr> > Date: Wed, 17 Oct 2018 05:56:25 +0000 > Subject: [PATCH] powerpc/mm: don't use pte_pgprot() in ioremap_prot() > > pte_pgprot() filters out some required flags like _PAGE_PRESENT. > > This patch replaces pte_pgprot() by __pgprot(pte_val()) > in ioremap_prot() > > Fixes: 26973fa5ac0e ("powerpc/mm: use pte helpers in generic code") > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > --- > arch/powerpc/mm/pgtable_32.c | 3 ++- > arch/powerpc/mm/pgtable_64.c | 4 ++-- > 2 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c > index 5877f5aa8f5d..a606e2f4937b 100644 > --- a/arch/powerpc/mm/pgtable_32.c > +++ b/arch/powerpc/mm/pgtable_32.c > @@ -122,7 +122,8 @@ ioremap_prot(phys_addr_t addr, unsigned long size, > unsigned long flags) > pte = pte_exprotect(pte); > pte = pte_mkprivileged(pte); > > - return __ioremap_caller(addr, size, pte_pgprot(pte), > __builtin_return_address(0)); > + return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), > + __builtin_return_address(0)); > } > EXPORT_SYMBOL(ioremap_prot); > > diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c > index fb1375c07e8c..836bf436cabb 100644 > --- a/arch/powerpc/mm/pgtable_64.c > +++ b/arch/powerpc/mm/pgtable_64.c > @@ -245,8 +245,8 @@ void __iomem * ioremap_prot(phys_addr_t addr, > unsigned long size, > pte = pte_mkprivileged(pte); > > if (ppc_md.ioremap) > - return ppc_md.ioremap(addr, size, pte_pgprot(pte), caller); > - return __ioremap_caller(addr, size, pte_pgprot(pte), caller); > + return ppc_md.ioremap(addr, size, __pgprot(pte_val(pte)), caller); > + return __ioremap_caller(addr, size, __pgprot(pte_val(pte)), caller); > } > > > -- > 2.13.3
On 10/17/2018 10:32 AM, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> On 10/17/2018 12:59 AM, Michael Ellerman wrote:
> ...
>>> The question is what's the right way to fix it? Should pte_pgprot() not
>>> be filtering those bits out on book3e?
>>
>> I think we should not use pte_pggrot() for that then. What about the
>> below fix ?
>
> Thanks, that almost works.
>
> pte_mkprivileged() also needs to not strip _PAGE_BAP_SR.

Oops, I missed it although I knew it. Patch below.

From: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Wed, 17 Oct 2018 10:46:24 +0000
Subject: [PATCH] powerpc/book3e: redefine pte_mkprivileged() and pte_mkuser()
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras <paulus@samba.org>, Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org

Book3e defines both _PAGE_USER and _PAGE_PRIVILEGED, so the nohash
default pte_mkprivileged() and pte_mkuser() are not usable.

This patch redefines them for book3e.

Fixes: a0da4bc166f2 ("powerpc/mm: Allow platforms to redefine some helpers")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/powerpc/include/asm/nohash/pte-book3e.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 58eef8cb569d..fb4297dff3e2 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -109,5 +109,19 @@
 #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_USER)
 #define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
 
+static inline pte_t pte_mkprivileged(pte_t pte)
+{
+	return __pte((pte_val(pte) & ~_PAGE_USER) | _PAGE_PRIVILEGED);
+}
+
+#define pte_mkprivileged pte_mkprivileged
+
+static inline pte_t pte_mkuser(pte_t pte)
+{
+	return __pte((pte_val(pte) & ~_PAGE_PRIVILEGED) | _PAGE_USER);
+}
+
+#define pte_mkuser pte_mkuser
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_NOHASH_PTE_BOOK3E_H */
On 10/17/18 4:42 PM, Christophe Leroy wrote: > > > On 10/17/2018 10:32 AM, Michael Ellerman wrote: >> Christophe Leroy <christophe.leroy@c-s.fr> writes: >>> On 10/17/2018 12:59 AM, Michael Ellerman wrote: >> ... >>>> The question is what's the right way to fix it? Should pte_pgprot() not >>>> be filtering those bits out on book3e? >>> >>> I think we should not use pte_pggrot() for that then. What about the >>> below fix ? >> >> Thanks, that almost works. >> >> pte_mkprivileged() also needs to not strip _PAGE_BAP_SR. > > Oops, I missed it allthough I knew it. Patch below. > > From: Christophe Leroy <christophe.leroy@c-s.fr> > Date: Wed, 17 Oct 2018 10:46:24 +0000 > Subject: [PATCH] powerpc/book3e: redefine pte_mkprivileged() and > pte_mkuser() > To: Benjamin Herrenschmidt <benh@kernel.crashing.org>, Paul Mackerras > <paulus@samba.org>, Michael Ellerman <mpe@ellerman.id.au> > Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org > > Book3e defines both _PAGE_USER and _PAGE_PRIVILEGED, so the nohash > default pte_mkprivileged() and pte_mkuser() are not usable. > > This patch redefines them for book3e. > > Fixes: a0da4bc166f2 ("powerpc/mm: Allow platforms to redefine some > helpers") > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > --- > arch/powerpc/include/asm/nohash/pte-book3e.h | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h > b/arch/powerpc/include/asm/nohash/pte-book3e.h > index 58eef8cb569d..fb4297dff3e2 100644 > --- a/arch/powerpc/include/asm/nohash/pte-book3e.h > +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h > @@ -109,5 +109,19 @@ > #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_USER) > #define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC) > > +static inline pte_t pte_mkprivileged(pte_t pte) > +{ > + return __pte((pte_val(pte) & ~_PAGE_USER) | _PAGE_PRIVILEGED); > +} > + > +#define pte_mkprivileged pte_mkprivileged > + > +static inline pte_t pte_mkuser(pte_t pte) > +{ > + return __pte((pte_val(pte) & ~_PAGE_PRIVILEGED) | _PAGE_USER); > +} > + > +#define pte_mkuser pte_mkuser > + I was build testing a similar patch. We would need to put #ifndef __ASSEMBLY__ around it. -aneesh
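For reference, a sketch of what the book3e header ends up looking like with that guard added (same helpers as in Christophe's patch above, just wrapped in #ifndef __ASSEMBLY__ because the header is also pulled in from assembly; excerpt-style, not a standalone program):

#ifndef __ASSEMBLY__
static inline pte_t pte_mkprivileged(pte_t pte)
{
	return __pte((pte_val(pte) & ~_PAGE_USER) | _PAGE_PRIVILEGED);
}

#define pte_mkprivileged pte_mkprivileged

static inline pte_t pte_mkuser(pte_t pte)
{
	return __pte((pte_val(pte) & ~_PAGE_PRIVILEGED) | _PAGE_USER);
}

#define pte_mkuser pte_mkuser
#endif /* !__ASSEMBLY__ */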
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index a6ca799e0eb5..a0dc3a3eef33 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -331,17 +331,14 @@ static inline bool pte_ci(pte_t pte) #define pte_access_permitted pte_access_permitted static inline bool pte_access_permitted(pte_t pte, bool write) { - unsigned long pteval = pte_val(pte); /* * A read-only access is controlled by _PAGE_USER bit. * We have _PAGE_READ set for WRITE and EXECUTE */ - unsigned long need_pte_bits = _PAGE_PRESENT | _PAGE_USER; - - if (write) - need_pte_bits |= _PAGE_WRITE; + if (!pte_present(pte) || !pte_user(pte) || !pte_read(pte)) + return false; - if ((pteval & need_pte_bits) != need_pte_bits) + if (write && !pte_write(pte)) return false; return true; diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 6fecfd7854f5..a4156da4a7a4 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -277,7 +277,10 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - pte_update(ptep, (_PAGE_RW | _PAGE_HWWRITE), _PAGE_RO); + unsigned long clr = ~pte_val(pte_wrprotect(__pte(~0))); + unsigned long set = pte_val(pte_wrprotect(__pte(0))); + + pte_update(ptep, clr, set); } static inline void huge_ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) @@ -291,9 +294,10 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, int psize) { - unsigned long set = pte_val(entry) & - (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC); - unsigned long clr = ~pte_val(entry) & (_PAGE_RO | _PAGE_NA); + pte_t pte_set = pte_mkyoung(pte_mkdirty(pte_mkwrite(pte_mkexec(__pte(0))))); + pte_t pte_clr = pte_mkyoung(pte_mkdirty(pte_mkwrite(pte_mkexec(__pte(~0))))); + unsigned long set = pte_val(entry) & pte_val(pte_set); + unsigned long clr = ~pte_val(entry) & ~pte_val(pte_clr); pte_update(ptep, clr, set); diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index b256e38a047c..062d96233673 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -32,8 +32,7 @@ static inline pgprot_t pte_pgprot(pte_t pte) { return __pgprot(pte_val(pte) & PA */ static inline int pte_protnone(pte_t pte) { - return (pte_val(pte) & - (_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT; + return pte_present(pte) && !pte_user(pte); } static inline int pmd_protnone(pmd_t pmd) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index f97d9c3760e3..ca4b1f7ac39d 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -44,20 +44,13 @@ static inline int is_exec_fault(void) static inline int pte_looks_normal(pte_t pte) { -#if defined(CONFIG_PPC_BOOK3S_64) - if ((pte_val(pte) & (_PAGE_PRESENT | _PAGE_SPECIAL)) == _PAGE_PRESENT) { + if (pte_present(pte) && !pte_special(pte)) { if (pte_ci(pte)) return 0; if (pte_user(pte)) return 1; } return 0; -#else - return (pte_val(pte) & - (_PAGE_PRESENT | _PAGE_SPECIAL | _PAGE_NO_CACHE | _PAGE_USER | - _PAGE_PRIVILEGED)) == - (_PAGE_PRESENT | _PAGE_USER); -#endif } static struct page *maybe_pte_to_page(pte_t pte) @@ -117,7 +110,7 @@ static pte_t set_pte_filter(pte_t pte) struct page *pg; /* No exec permission 
in the first place, move on */ - if (!(pte_val(pte) & _PAGE_EXEC) || !pte_looks_normal(pte)) + if (!pte_exec(pte) || !pte_looks_normal(pte)) return pte; /* If you set _PAGE_EXEC on weird pages you're on your own */ @@ -137,7 +130,7 @@ static pte_t set_pte_filter(pte_t pte) } /* Else, we filter out _PAGE_EXEC */ - return __pte(pte_val(pte) & ~_PAGE_EXEC); + return pte_exprotect(pte); } static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma, @@ -150,7 +143,7 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma, * if necessary. Also if _PAGE_EXEC is already set, same deal, * we just bail out */ - if (dirty || (pte_val(pte) & _PAGE_EXEC) || !is_exec_fault()) + if (dirty || pte_exec(pte) || !is_exec_fault()) return pte; #ifdef CONFIG_DEBUG_VM @@ -176,7 +169,7 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma, set_bit(PG_arch_1, &pg->flags); bail: - return __pte(pte_val(pte) | _PAGE_EXEC); + return pte_mkexec(pte); } #endif /* CONFIG_PPC_BOOK3S */ @@ -191,10 +184,10 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, * Make sure hardware valid bit is not set. We don't do * tlb flush for this update. */ - VM_WARN_ON(pte_val(*ptep) & _PAGE_PRESENT); + VM_WARN_ON(pte_hw_valid(*ptep)); /* Add the pte bit when trying to set a pte */ - pte = __pte(pte_val(pte) | _PAGE_PTE); + pte = pte_mkpte(pte); /* Note: mm->context.id might not yet have been assigned as * this context might not have been activated yet when this diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 01f348938328..5877f5aa8f5d 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -112,15 +112,17 @@ EXPORT_SYMBOL(ioremap_coherent); void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) { + pte_t pte = __pte(flags); + /* writeable implies dirty for kernel addresses */ - if ((flags & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO) - flags |= _PAGE_DIRTY | _PAGE_HWWRITE; + if (pte_write(pte)) + pte = pte_mkdirty(pte); /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */ - flags &= ~(_PAGE_USER | _PAGE_EXEC); - flags |= _PAGE_PRIVILEGED; + pte = pte_exprotect(pte); + pte = pte_mkprivileged(pte); - return __ioremap_caller(addr, size, __pgprot(flags), __builtin_return_address(0)); + return __ioremap_caller(addr, size, pte_pgprot(pte), __builtin_return_address(0)); } EXPORT_SYMBOL(ioremap_prot); @@ -235,8 +237,7 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot) /* The PTE should never be already set nor present in the * hash table */ - BUG_ON((pte_val(*pg) & (_PAGE_PRESENT | _PAGE_HASHPTE)) && - pgprot_val(prot)); + BUG_ON((pte_present(*pg) | pte_hashpte(*pg)) && pgprot_val(prot)); set_pte_at(&init_mm, va, pg, pfn_pte(pa >> PAGE_SHIFT, prot)); } smp_wmb(); diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index b0f4a4b4f62b..fb1375c07e8c 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -230,23 +230,23 @@ void __iomem *ioremap_coherent(phys_addr_t addr, unsigned long size) void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) { + pte_t pte = __pte(flags); void *caller = __builtin_return_address(0); /* writeable implies dirty for kernel addresses */ - if (flags & _PAGE_WRITE) - flags |= _PAGE_DIRTY; + if (pte_write(pte)) + pte = pte_mkdirty(pte); /* we don't want to let _PAGE_EXEC leak out */ - flags &= ~_PAGE_EXEC; + pte = pte_exprotect(pte); /* * Force kernel 
mapping. */ - flags &= ~_PAGE_USER; - flags |= _PAGE_PRIVILEGED; + pte = pte_mkprivileged(pte); if (ppc_md.ioremap) - return ppc_md.ioremap(addr, size, __pgprot(flags), caller); - return __ioremap_caller(addr, size, __pgprot(flags), caller); + return ppc_md.ioremap(addr, size, pte_pgprot(pte), caller); + return __ioremap_caller(addr, size, pte_pgprot(pte), caller); } diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c index c70d17c9a6ba..167271c7a97c 100644 --- a/arch/powerpc/xmon/xmon.c +++ b/arch/powerpc/xmon/xmon.c @@ -2993,15 +2993,17 @@ static void show_task(struct task_struct *tsk) #ifdef CONFIG_PPC_BOOK3S_64 void format_pte(void *ptep, unsigned long pte) { + pte_t entry = __pte(pte); + printf("ptep @ 0x%016lx = 0x%016lx\n", (unsigned long)ptep, pte); printf("Maps physical address = 0x%016lx\n", pte & PTE_RPN_MASK); printf("Flags = %s%s%s%s%s\n", - (pte & _PAGE_ACCESSED) ? "Accessed " : "", - (pte & _PAGE_DIRTY) ? "Dirty " : "", - (pte & _PAGE_READ) ? "Read " : "", - (pte & _PAGE_WRITE) ? "Write " : "", - (pte & _PAGE_EXEC) ? "Exec " : ""); + pte_young(entry) ? "Accessed " : "", + pte_dirty(entry) ? "Dirty " : "", + pte_read(entry) ? "Read " : "", + pte_write(entry) ? "Write " : "", + pte_exec(entry) ? "Exec " : ""); } static void show_pte(unsigned long addr)
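One construct in the ptep_set_wrprotect() hunk above is worth spelling out: the clear and set masks are derived from the helper itself, by applying pte_wrprotect() to an all-ones and an all-zeroes PTE, so no _PAGE_XXX flag has to be named in common code. A toy standalone C example of the same trick, with made-up flag values rather than the real 8xx/40x ones:

#include <stdio.h>

/* made-up flag layout, for illustration only */
#define _PAGE_RO	0x001UL
#define _PAGE_RW	0x002UL

typedef struct { unsigned long val; } pte_t;
static pte_t __pte(unsigned long v) { pte_t p = { v }; return p; }
static unsigned long pte_val(pte_t p) { return p.val; }

/* toy wrprotect: clear RW, set RO */
static pte_t pte_wrprotect(pte_t p)
{
	return __pte((pte_val(p) & ~_PAGE_RW) | _PAGE_RO);
}

int main(void)
{
	/* same derivation as in ptep_set_wrprotect() above */
	unsigned long clr = ~pte_val(pte_wrprotect(__pte(~0UL)));	/* bits the helper clears */
	unsigned long set = pte_val(pte_wrprotect(__pte(0UL)));		/* bits the helper sets */

	printf("clr = 0x%lx, set = 0x%lx\n", clr, set);	/* clr = 0x2, set = 0x1 */
	return 0;
}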