Message ID | YX8HK7ZZZhjRQzcr@ls3530 (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | parisc: Fix code/instruction patching on PA1.x machines | expand |
Hi Helge, I think the real problem is that neither flush_kernel_vmap_range() or invalidate_kernel_vmap_range() flush the icache. They only operate on the data cache. flush_icache_range will flush both caches. Dave On 2021-10-31 5:14 p.m., Helge Deller wrote: > On PA1.x machines it's not sufficient to just flush the data and > instruction caches when we have written new instruction codes into the > parallel mapped memory segment, but we really need to invalidate (purge) > the cache too. Otherwise the processor will still execute the old > instructions which are still in the data/instruction cache. > > Signed-off-by: Helge Deller <deller@gmx.de> > Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") > Cc: stable@vger.kernel.org # v5.3+ > > diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c > index 80a0ab372802..8cbb7e1d5a2b 100644 > --- a/arch/parisc/kernel/patch.c > +++ b/arch/parisc/kernel/patch.c > @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > * We're crossing a page boundary, so > * need to remap > */ > - flush_kernel_vmap_range((void *)fixmap, > + invalidate_kernel_vmap_range((void *)fixmap, > (p-fixmap) * sizeof(*p)); > if (mapped) > patch_unmap(FIX_TEXT_POKE0, &flags); > @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > } > } > > - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); > + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); > if (mapped) > patch_unmap(FIX_TEXT_POKE0, &flags); > + invalidate_kernel_vmap_range((void *)start, end - start); > flush_icache_range(start, end); > } >
Hi Dave, On 11/3/21 21:12, John David Anglin wrote: > I think the real problem is that neither flush_kernel_vmap_range() or > invalidate_kernel_vmap_range() flush the icache. They only operate > on the data cache. flush_icache_range will flush both caches. Yes. But we write the new instructions to a congruently memory are (same physical memory like the kernel code), then flush/invalidate the D-Cache, and finally flush the I-cache of kernel code memory. See last function call of __patch_text_multiple(). So, logically I think it should work (and it does on PA2.x). Or do you mean to flush the I-Cache of both mappings? Helge > > Dave > > On 2021-10-31 5:14 p.m., Helge Deller wrote: >> On PA1.x machines it's not sufficient to just flush the data and >> instruction caches when we have written new instruction codes into the >> parallel mapped memory segment, but we really need to invalidate (purge) >> the cache too. Otherwise the processor will still execute the old >> instructions which are still in the data/instruction cache. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") >> Cc: stable@vger.kernel.org # v5.3+ >> >> diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c >> index 80a0ab372802..8cbb7e1d5a2b 100644 >> --- a/arch/parisc/kernel/patch.c >> +++ b/arch/parisc/kernel/patch.c >> @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >> * We're crossing a page boundary, so >> * need to remap >> */ >> - flush_kernel_vmap_range((void *)fixmap, >> + invalidate_kernel_vmap_range((void *)fixmap, >> (p-fixmap) * sizeof(*p)); >> if (mapped) >> patch_unmap(FIX_TEXT_POKE0, &flags); >> @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >> } >> } >> >> - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >> + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >> if (mapped) >> patch_unmap(FIX_TEXT_POKE0, &flags); >> + invalidate_kernel_vmap_range((void *)start, end - start); >> flush_icache_range(start, end); ^^ HERE >> } >> > >
On 2021-11-03 5:08 p.m., Helge Deller wrote: > Hi Dave, > > On 11/3/21 21:12, John David Anglin wrote: >> I think the real problem is that neither flush_kernel_vmap_range() or >> invalidate_kernel_vmap_range() flush the icache. They only operate >> on the data cache. flush_icache_range will flush both caches. > Yes. > But we write the new instructions to a congruently memory are (same > physical memory like the kernel code), then flush/invalidate the > D-Cache, and finally flush the I-cache of kernel code memory. > See last function call of __patch_text_multiple(). > > So, logically I think it should work (and it does on PA2.x). I still believe it is incorrect to use invalidate_kernel_vmap_range() to flush the writes in __patch_text_multiple() to memory. Both the PA 1.1 and 2.0 architecture documents state that it is implementation dependent whether pdc writes back dirty lines to memory at priority 0. If the writes are not flushed to memory, the patch won't install. > > Or do you mean to flush the I-Cache of both mappings? I reviewed the flush operations in __patch_text_multiple(). I believe the current code is more or less correct, but not optimal. I believe the final call to flush_icache_range() is unnecessary. We can also eliminate one range flush in the calls to make sure we don't have any aliases in the cache. See change attached below. The big problem with __patch_text_multiple() is can only patch code that the patch code doesn't depend on. This line in __patch_text_multiple() will cause a TLB fault on the first execution after a new patch_map is setup. We do alternative patching to the TLB handler when ALT_COND_NO_SMP is true. For example, .macro ptl_lock spc,ptp,pte,tmp,tmp1,fault #ifdef CONFIG_TLB_PTLOCK 98: cmpib,COND(=),n 0,\spc,2f get_ptl \tmp 1: LDCW 0(\tmp),\tmp1 cmpib,COND(=) 0,\tmp1,1b nop LDREG 0(\ptp),\pte bb,<,n \pte,_PAGE_PRESENT_BIT,3f b \fault stw \spc,0(\tmp) 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) #endif 2: LDREG 0(\ptp),\pte bb,>=,n \pte,_PAGE_PRESENT_BIT,\fault 3: .endm If the region being patched spans a page boundary, we will get a TLB fault with partially patched TLB code. I suspect something like this is the real issue with the fixmap code. Maybe this could be avoided by patching in a "b,n 99f" instruction at 98b. That would avoid patching in multiple nop instructions. Dave > > Helge > > >> >> Dave >> >> On 2021-10-31 5:14 p.m., Helge Deller wrote: >>> On PA1.x machines it's not sufficient to just flush the data and >>> instruction caches when we have written new instruction codes into the >>> parallel mapped memory segment, but we really need to invalidate (purge) >>> the cache too. Otherwise the processor will still execute the old >>> instructions which are still in the data/instruction cache. >>> >>> Signed-off-by: Helge Deller <deller@gmx.de> >>> Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") >>> Cc: stable@vger.kernel.org # v5.3+ >>> >>> diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c >>> index 80a0ab372802..8cbb7e1d5a2b 100644 >>> --- a/arch/parisc/kernel/patch.c >>> +++ b/arch/parisc/kernel/patch.c >>> @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>> * We're crossing a page boundary, so >>> * need to remap >>> */ >>> - flush_kernel_vmap_range((void *)fixmap, >>> + invalidate_kernel_vmap_range((void *)fixmap, >>> (p-fixmap) * sizeof(*p)); >>> if (mapped) >>> patch_unmap(FIX_TEXT_POKE0, &flags); >>> @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>> } >>> } >>> >>> - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>> + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>> if (mapped) >>> patch_unmap(FIX_TEXT_POKE0, &flags); >>> + invalidate_kernel_vmap_range((void *)start, end - start); >>> flush_icache_range(start, end); > --- diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c index 80a0ab372802..799795bc4210 100644 --- a/arch/parisc/kernel/patch.c +++ b/arch/parisc/kernel/patch.c @@ -67,8 +67,8 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) int mapped; /* Make sure we don't have any aliases in cache */ - flush_kernel_vmap_range(addr, len); flush_icache_range(start, end); + flush_tlb_kernel_range(start, end); p = fixmap = patch_map(addr, FIX_TEXT_POKE0, &flags, &mapped); @@ -93,7 +93,6 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); if (mapped) patch_unmap(FIX_TEXT_POKE0, &flags); - flush_icache_range(start, end); } void __kprobes __patch_text(void *addr, u32 insn)
On 2021-11-04 1:57 p.m., John David Anglin wrote: > On 2021-11-03 5:08 p.m., Helge Deller wrote: >> Hi Dave, >> >> On 11/3/21 21:12, John David Anglin wrote: >>> I think the real problem is that neither flush_kernel_vmap_range() or >>> invalidate_kernel_vmap_range() flush the icache. They only operate >>> on the data cache. flush_icache_range will flush both caches. >> Yes. >> But we write the new instructions to a congruently memory are (same >> physical memory like the kernel code), then flush/invalidate the >> D-Cache, and finally flush the I-cache of kernel code memory. >> See last function call of __patch_text_multiple(). >> >> So, logically I think it should work (and it does on PA2.x). > > I still believe it is incorrect to use invalidate_kernel_vmap_range() to flush the writes in > __patch_text_multiple() to memory. Both the PA 1.1 and 2.0 architecture documents state that > it is implementation dependent whether pdc writes back dirty lines to memory at priority 0. > If the writes are not flushed to memory, the patch won't install. > >> >> Or do you mean to flush the I-Cache of both mappings? > > I reviewed the flush operations in __patch_text_multiple(). I believe the current code is more or > less correct, but not optimal. I believe the final call to flush_icache_range() is unnecessary. We > can also eliminate one range flush in the calls to make sure we don't have any aliases in the cache. > See change attached below. > > The big problem with __patch_text_multiple() is can only patch code that the patch code doesn't depend > on. This line in __patch_text_multiple() will cause a TLB fault on the first execution after a new *p++ = *insn++; > patch_map is setup. We do alternative patching to the TLB handler when ALT_COND_NO_SMP is true. For > example, > > .macro ptl_lock spc,ptp,pte,tmp,tmp1,fault > #ifdef CONFIG_TLB_PTLOCK > 98: cmpib,COND(=),n 0,\spc,2f > get_ptl \tmp > 1: LDCW 0(\tmp),\tmp1 > cmpib,COND(=) 0,\tmp1,1b > nop > LDREG 0(\ptp),\pte > bb,<,n \pte,_PAGE_PRESENT_BIT,3f > b \fault > stw \spc,0(\tmp) > 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) > #endif > 2: LDREG 0(\ptp),\pte > bb,>=,n \pte,_PAGE_PRESENT_BIT,\fault > 3: > .endm > > If the region being patched spans a page boundary, we will get a TLB fault with partially patched TLB > code. I suspect something like this is the real issue with the fixmap code. > > Maybe this could be avoided by patching in a "b,n 99f" instruction at 98b. That would avoid patching > in multiple nop instructions. > > Dave > >> >> Helge >> >> >>> >>> Dave >>> >>> On 2021-10-31 5:14 p.m., Helge Deller wrote: >>>> On PA1.x machines it's not sufficient to just flush the data and >>>> instruction caches when we have written new instruction codes into the >>>> parallel mapped memory segment, but we really need to invalidate (purge) >>>> the cache too. Otherwise the processor will still execute the old >>>> instructions which are still in the data/instruction cache. >>>> >>>> Signed-off-by: Helge Deller <deller@gmx.de> >>>> Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") >>>> Cc: stable@vger.kernel.org # v5.3+ >>>> >>>> diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c >>>> index 80a0ab372802..8cbb7e1d5a2b 100644 >>>> --- a/arch/parisc/kernel/patch.c >>>> +++ b/arch/parisc/kernel/patch.c >>>> @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>>> * We're crossing a page boundary, so >>>> * need to remap >>>> */ >>>> - flush_kernel_vmap_range((void *)fixmap, >>>> + invalidate_kernel_vmap_range((void *)fixmap, >>>> (p-fixmap) * sizeof(*p)); >>>> if (mapped) >>>> patch_unmap(FIX_TEXT_POKE0, &flags); >>>> @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>>> } >>>> } >>>> >>>> - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>>> + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>>> if (mapped) >>>> patch_unmap(FIX_TEXT_POKE0, &flags); >>>> + invalidate_kernel_vmap_range((void *)start, end - start); >>>> flush_icache_range(start, end); >> > > --- > diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c > index 80a0ab372802..799795bc4210 100644 > --- a/arch/parisc/kernel/patch.c > +++ b/arch/parisc/kernel/patch.c > @@ -67,8 +67,8 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > int mapped; > > /* Make sure we don't have any aliases in cache */ > - flush_kernel_vmap_range(addr, len); > flush_icache_range(start, end); > + flush_tlb_kernel_range(start, end); > > p = fixmap = patch_map(addr, FIX_TEXT_POKE0, &flags, &mapped); > > @@ -93,7 +93,6 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); > if (mapped) > patch_unmap(FIX_TEXT_POKE0, &flags); > - flush_icache_range(start, end); > } > > void __kprobes __patch_text(void *addr, u32 insn)
Hi Dave, On 11/4/21 18:57, John David Anglin wrote: > On 2021-11-03 5:08 p.m., Helge Deller wrote: >> On 11/3/21 21:12, John David Anglin wrote: >>> I think the real problem is that neither flush_kernel_vmap_range() or >>> invalidate_kernel_vmap_range() flush the icache. They only operate >>> on the data cache. flush_icache_range will flush both caches. >> Yes. >> But we write the new instructions to a congruently memory are (same >> physical memory like the kernel code), then flush/invalidate the >> D-Cache, and finally flush the I-cache of kernel code memory. >> See last function call of __patch_text_multiple(). >> >> So, logically I think it should work (and it does on PA2.x). > > I still believe it is incorrect to use invalidate_kernel_vmap_range() to flush the writes in > __patch_text_multiple() to memory. Both the PA 1.1 and 2.0 architecture documents state that > it is implementation dependent whether pdc writes back dirty lines to memory at priority 0. > If the writes are not flushed to memory, the patch won't install. > >> Or do you mean to flush the I-Cache of both mappings? > > I reviewed the flush operations in __patch_text_multiple(). I believe the current code is more or > less correct, but not optimal. I believe the final call to flush_icache_range() is unnecessary. We > can also eliminate one range flush in the calls to make sure we don't have any aliases in the cache. > See change attached below. > > The big problem with __patch_text_multiple() is can only patch code that the patch code doesn't depend > on. This line in __patch_text_multiple() will cause a TLB fault on the first execution after a new > patch_map is setup. We do alternative patching to the TLB handler when ALT_COND_NO_SMP is true. I believe the alternative code patching isn't critical. It just patches in NOPs, so even if the newly patched NOPs aren't visible yet (when the TLB handler is executed), it uses the old instructions which shouldn't harm anything. They were executed before the whole time. I do agree, that patching other code paths (with non-NOPs) could be critial. By the way, to narrow down the problem, I do boot those tests here with the "no-alternatives" kernel parameter, which effectively disables the alternatives-patching. > For example, > > .macro ptl_lock spc,ptp,pte,tmp,tmp1,fault > #ifdef CONFIG_TLB_PTLOCK > 98: cmpib,COND(=),n 0,\spc,2f > get_ptl \tmp > 1: LDCW 0(\tmp),\tmp1 > cmpib,COND(=) 0,\tmp1,1b > nop > LDREG 0(\ptp),\pte > bb,<,n \pte,_PAGE_PRESENT_BIT,3f > b \fault > stw \spc,0(\tmp) > 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) > #endif > 2: LDREG 0(\ptp),\pte > bb,>=,n \pte,_PAGE_PRESENT_BIT,\fault > 3: > .endm > > If the region being patched spans a page boundary, we will get a TLB fault with partially patched TLB > code. I suspect something like this is the real issue with the fixmap code. > > Maybe this could be avoided by patching in a "b,n 99f" instruction at 98b. That would avoid patching > in multiple nop instructions. It's already being done that way. See alternative.c: /* * Replace instruction with NOPs? * For long distance insert a branch instruction instead. */ if (replacement == INSN_NOP && len > 1) replacement = 0xe8000002 + (len-2)*8; /* "b,n .+8" */ >>> On 2021-10-31 5:14 p.m., Helge Deller wrote: >>>> On PA1.x machines it's not sufficient to just flush the data and >>>> instruction caches when we have written new instruction codes into the >>>> parallel mapped memory segment, but we really need to invalidate (purge) >>>> the cache too. Otherwise the processor will still execute the old >>>> instructions which are still in the data/instruction cache. >>>> >>>> Signed-off-by: Helge Deller <deller@gmx.de> >>>> Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") >>>> Cc: stable@vger.kernel.org # v5.3+ >>>> >>>> diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c >>>> index 80a0ab372802..8cbb7e1d5a2b 100644 >>>> --- a/arch/parisc/kernel/patch.c >>>> +++ b/arch/parisc/kernel/patch.c >>>> @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>>> * We're crossing a page boundary, so >>>> * need to remap >>>> */ >>>> - flush_kernel_vmap_range((void *)fixmap, >>>> + invalidate_kernel_vmap_range((void *)fixmap, >>>> (p-fixmap) * sizeof(*p)); >>>> if (mapped) >>>> patch_unmap(FIX_TEXT_POKE0, &flags); >>>> @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>>> } >>>> } >>>> >>>> - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>>> + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>>> if (mapped) >>>> patch_unmap(FIX_TEXT_POKE0, &flags); >>>> + invalidate_kernel_vmap_range((void *)start, end - start); >>>> flush_icache_range(start, end); >> > > --- > diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c > index 80a0ab372802..799795bc4210 100644 > --- a/arch/parisc/kernel/patch.c > +++ b/arch/parisc/kernel/patch.c > @@ -67,8 +67,8 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > int mapped; > > /* Make sure we don't have any aliases in cache */ > - flush_kernel_vmap_range(addr, len); > flush_icache_range(start, end); > + flush_tlb_kernel_range(start, end); > > p = fixmap = patch_map(addr, FIX_TEXT_POKE0, &flags, &mapped); > > @@ -93,7 +93,6 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); > if (mapped) > patch_unmap(FIX_TEXT_POKE0, &flags); > - flush_icache_range(start, end); > } > > void __kprobes __patch_text(void *addr, u32 insn) Here is the syslog with your patch on the 715/64: ... Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) mem auto-init: stack:off, heap alloc:off, heap free:off Memory: 148832K/163840K available (8452K kernel code, 1461K rwdata, 2379K rodata, 696K init, 444K bss, 15008K reserved, 0K cma-reserved) ftrace: allocating 26912 entries in 53 pages Backtrace: [<1097d588>] __patch_text+0x20/0x30 [<101c5128>] ftrace_make_nop+0x3c/0x68 [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 [<10113ba0>] ftrace_init+0xa8/0x154 [<10101284>] start_kernel+0x3d0/0x708 [<10105244>] start_parisc+0xb8/0xec Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0effe310 CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-32bit+ #1017 Hardware name: 9000/715 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00000000000011000000001100001110 Not tainted r00-03 000c030e 10c7cda8 1097d474 10c67340 r04-07 10100314 00000000 10c672dc 10c7d328 r08-11 00000000 00000000 10cbe5a8 00000000 r12-15 10c75da8 10b2f800 10d7d5a8 00000001 r16-19 00000000 00000002 00000000 08000240 r20-23 10c67000 00000200 10100500 00000020 r24-27 0efff000 0efff000 0effe310 10b915a8 r28-31 0effe310 00000000 10c673c0 10279a68 sr00-03 00000000 00000000 00000000 00000000 sr04-07 00000000 00000000 00000000 00000000 IASQ: 00000000 00000000 IAOQ: 1097d4e4 1097d480 IIR: 0f9312a8 ISR: 00000000 IOR: 0effe310 CPU: 0 CR30: 10c67000 CR31: f00effff ORIG_R28: 00000000 IAOQ[0]: __patch_text_multiple+0xdc/0x12c IAOQ[1]: __patch_text_multiple+0x78/0x12c RP(r2): __patch_text_multiple+0x6c/0x12c Backtrace: [<1097d588>] __patch_text+0x20/0x30 [<101c5128>] ftrace_make_nop+0x3c/0x68 [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 [<10113ba0>] ftrace_init+0xa8/0x154 [<10101284>] start_kernel+0x3d0/0x708 [<10105244>] start_parisc+0xb8/0xec Kernel panic - not syncing: Bad Address (null pointer deref?) Helge
On 2021-11-04 4:48 p.m., Helge Deller wrote: > Here is the syslog with your patch on the 715/64: > ... > Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) > mem auto-init: stack:off, heap alloc:off, heap free:off > Memory: 148832K/163840K available (8452K kernel code, 1461K rwdata, 2379K rodata, 696K init, 444K bss, 15008K reserved, 0K cma-reserved) > ftrace: allocating 26912 entries in 53 pages > Backtrace: > [<1097d588>] __patch_text+0x20/0x30 > [<101c5128>] ftrace_make_nop+0x3c/0x68 > [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 > [<10113ba0>] ftrace_init+0xa8/0x154 > [<10101284>] start_kernel+0x3d0/0x708 > [<10105244>] start_parisc+0xb8/0xec > > Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0effe310 > CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-32bit+ #1017 > Hardware name: 9000/715 > > YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI > PSW: 00000000000011000000001100001110 Not tainted > r00-03 000c030e 10c7cda8 1097d474 10c67340 > r04-07 10100314 00000000 10c672dc 10c7d328 > r08-11 00000000 00000000 10cbe5a8 00000000 > r12-15 10c75da8 10b2f800 10d7d5a8 00000001 > r16-19 00000000 00000002 00000000 08000240 > r20-23 10c67000 00000200 10100500 00000020 > r24-27 0efff000 0efff000 0effe310 10b915a8 > r28-31 0effe310 00000000 10c673c0 10279a68 > sr00-03 00000000 00000000 00000000 00000000 > sr04-07 00000000 00000000 00000000 00000000 > > IASQ: 00000000 00000000 IAOQ: 1097d4e4 1097d480 > IIR: 0f9312a8 ISR: 00000000 IOR: 0effe310 > CPU: 0 CR30: 10c67000 CR31: f00effff > ORIG_R28: 00000000 > IAOQ[0]: __patch_text_multiple+0xdc/0x12c > IAOQ[1]: __patch_text_multiple+0x78/0x12c > RP(r2): __patch_text_multiple+0x6c/0x12c > Backtrace: > [<1097d588>] __patch_text+0x20/0x30 > [<101c5128>] ftrace_make_nop+0x3c/0x68 > [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 > [<10113ba0>] ftrace_init+0xa8/0x154 > [<10101284>] start_kernel+0x3d0/0x708 > [<10105244>] start_parisc+0xb8/0xec > Kernel panic - not syncing: Bad Address (null pointer deref?) The faulting instruction is "stw,ma r19,4(ret0)". r19 contains nop instruction. r28 contains the address being patched. Where does it (0x0effe310) point in kernel? It seems likely to me that TLB handler is broken and that's the reason for the bad address fault. But maybe there's a map problem. Dave
On 2021-11-04 4:48 p.m., Helge Deller wrote: > I believe the alternative code patching isn't critical. > It just patches in NOPs, so even if the newly patched NOPs aren't visible yet (when the TLB handler is > executed), it uses the old instructions which shouldn't harm anything. They were executed before the > whole time. But we flush the data cache when we hit a page boundary. This makes these NOPs visible. Maybe we should do both flushes after all words have been written. Dave
Hi Dave, On 11/4/21 22:16, John David Anglin wrote: > On 2021-11-04 4:48 p.m., Helge Deller wrote: >> Here is the syslog with your patch on the 715/64: >> ... >> Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) >> mem auto-init: stack:off, heap alloc:off, heap free:off >> Memory: 148832K/163840K available (8452K kernel code, 1461K rwdata, 2379K rodata, 696K init, 444K bss, 15008K reserved, 0K cma-reserved) >> ftrace: allocating 26912 entries in 53 pages >> Backtrace: >> [<1097d588>] __patch_text+0x20/0x30 >> [<101c5128>] ftrace_make_nop+0x3c/0x68 >> [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 >> [<10113ba0>] ftrace_init+0xa8/0x154 >> [<10101284>] start_kernel+0x3d0/0x708 >> [<10105244>] start_parisc+0xb8/0xec >> >> Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0effe310 >> CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-32bit+ #1017 >> Hardware name: 9000/715 >> >> YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI >> PSW: 00000000000011000000001100001110 Not tainted >> r00-03 000c030e 10c7cda8 1097d474 10c67340 >> r04-07 10100314 00000000 10c672dc 10c7d328 >> r08-11 00000000 00000000 10cbe5a8 00000000 >> r12-15 10c75da8 10b2f800 10d7d5a8 00000001 >> r16-19 00000000 00000002 00000000 08000240 >> r20-23 10c67000 00000200 10100500 00000020 >> r24-27 0efff000 0efff000 0effe310 10b915a8 >> r28-31 0effe310 00000000 10c673c0 10279a68 >> sr00-03 00000000 00000000 00000000 00000000 >> sr04-07 00000000 00000000 00000000 00000000 >> >> IASQ: 00000000 00000000 IAOQ: 1097d4e4 1097d480 >> IIR: 0f9312a8 ISR: 00000000 IOR: 0effe310 >> CPU: 0 CR30: 10c67000 CR31: f00effff >> ORIG_R28: 00000000 >> IAOQ[0]: __patch_text_multiple+0xdc/0x12c >> IAOQ[1]: __patch_text_multiple+0x78/0x12c >> RP(r2): __patch_text_multiple+0x6c/0x12c >> Backtrace: >> [<1097d588>] __patch_text+0x20/0x30 >> [<101c5128>] ftrace_make_nop+0x3c/0x68 >> [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 >> [<10113ba0>] ftrace_init+0xa8/0x154 >> [<10101284>] start_kernel+0x3d0/0x708 >> [<10105244>] start_parisc+0xb8/0xec >> Kernel panic - not syncing: Bad Address (null pointer deref?) > The faulting instruction is "stw,ma r19,4(ret0)". r19 contains nop instruction. r28 > contains the address being patched. Where does it (0x0effe310) point in kernel? > > It seems likely to me that TLB handler is broken and that's the reason for the bad address fault. > But maybe there's a map problem. Right, there is a map problem! Please ignore this last backtrace I sent. I missed to include my previous patch to fix the TLB issue, aka this patch: https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-5.16-1&id=6e866a462867b60841202e900f10936a0478608c So, this backtrace is bogus... With my patch (and yours) applied now it comes further. Maybe there is generic ftrace/kprobe problem in kernel v5.15 into which I run? I'll continue testing on git head from Linus and report back soon... Helge
Hi Helge, On 2021-11-04 5:27 p.m., Helge Deller wrote: > Right, there is a map problem! > Please ignore this last backtrace I sent. > I missed to include my previous patch to fix the TLB issue, aka this patch: > https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/commit/?h=parisc-5.16-1&id=6e866a462867b60841202e900f10936a0478608c I have this change installed in my tree. rp3440 boots okay with my patch but I don't have CONFIG_KPROBES set. Dave
Helge Deller <deller@gmx.de> writes: > Hi Dave, > > On 11/4/21 18:57, John David Anglin wrote: >> On 2021-11-03 5:08 p.m., Helge Deller wrote: >>> On 11/3/21 21:12, John David Anglin wrote: >>>> I think the real problem is that neither flush_kernel_vmap_range() or >>>> invalidate_kernel_vmap_range() flush the icache. They only operate >>>> on the data cache. flush_icache_range will flush both caches. >>> Yes. >>> But we write the new instructions to a congruently memory are (same >>> physical memory like the kernel code), then flush/invalidate the >>> D-Cache, and finally flush the I-cache of kernel code memory. >>> See last function call of __patch_text_multiple(). >>> >>> So, logically I think it should work (and it does on PA2.x). >> >> I still believe it is incorrect to use invalidate_kernel_vmap_range() to flush the writes in >> __patch_text_multiple() to memory. Both the PA 1.1 and 2.0 architecture documents state that >> it is implementation dependent whether pdc writes back dirty lines to memory at priority 0. >> If the writes are not flushed to memory, the patch won't install. >> >>> Or do you mean to flush the I-Cache of both mappings? >> >> I reviewed the flush operations in __patch_text_multiple(). I believe the current code is more or >> less correct, but not optimal. I believe the final call to flush_icache_range() is unnecessary. We >> can also eliminate one range flush in the calls to make sure we don't have any aliases in the cache. >> See change attached below. >> >> The big problem with __patch_text_multiple() is can only patch code that the patch code doesn't depend >> on. This line in __patch_text_multiple() will cause a TLB fault on the first execution after a new >> patch_map is setup. We do alternative patching to the TLB handler when ALT_COND_NO_SMP is true. > > I believe the alternative code patching isn't critical. > It just patches in NOPs, so even if the newly patched NOPs aren't visible yet (when the TLB handler is > executed), it uses the old instructions which shouldn't harm anything. They were executed before the > whole time. > > I do agree, that patching other code paths (with non-NOPs) could be critial. > > By the way, to narrow down the problem, I do boot those tests here with the "no-alternatives" > kernel parameter, which effectively disables the alternatives-patching. > I think there are two paths in code patching, that are mixed up in the text above: a) __patch_text_multiple(), which is only used by ftrace (and kprobes on ftrace). This patches the functions as follows: - it patches the trampoline located before the function start - after doing so, it patches the branch which sits right at the beginning of the function. When removing, it first removes the branch and the trampoline code afterwards. This is done via two calls to __patch_text_multiple(), which isn't ideal. See ftrace_make_nop()/ftrace_make_call() b) The alternative patching, which replaces code with memcpy(), which might suffer from the tlb problem mentioned above. Sven
diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c index 80a0ab372802..8cbb7e1d5a2b 100644 --- a/arch/parisc/kernel/patch.c +++ b/arch/parisc/kernel/patch.c @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) * We're crossing a page boundary, so * need to remap */ - flush_kernel_vmap_range((void *)fixmap, + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); if (mapped) patch_unmap(FIX_TEXT_POKE0, &flags); @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) } } - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); if (mapped) patch_unmap(FIX_TEXT_POKE0, &flags); + invalidate_kernel_vmap_range((void *)start, end - start); flush_icache_range(start, end); }
On PA1.x machines it's not sufficient to just flush the data and instruction caches when we have written new instruction codes into the parallel mapped memory segment, but we really need to invalidate (purge) the cache too. Otherwise the processor will still execute the old instructions which are still in the data/instruction cache. Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") Cc: stable@vger.kernel.org # v5.3+