From patchwork Tue Oct 16 20:38:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Helge Deller X-Patchwork-Id: 10644171 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7F65E112B for ; Tue, 16 Oct 2018 20:38:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6787C2A809 for ; Tue, 16 Oct 2018 20:38:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5BC532A887; Tue, 16 Oct 2018 20:38:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,FREEMAIL_FROM, MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DCEE32A885 for ; Tue, 16 Oct 2018 20:38:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726429AbeJQEar (ORCPT ); Wed, 17 Oct 2018 00:30:47 -0400 Received: from mout.gmx.net ([212.227.15.18]:58777 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726325AbeJQEar (ORCPT ); Wed, 17 Oct 2018 00:30:47 -0400 Received: from ls3530.fritz.box ([92.116.157.151]) by mail.gmx.com (mrgmx002 [212.227.17.190]) with ESMTPSA (Nemesis) id 0MSv6D-1g2r3c08cH-00RpZW; Tue, 16 Oct 2018 22:38:25 +0200 Received: from ls3530.fritz.box ([92.116.157.151]) by mail.gmx.com (mrgmx002 [212.227.17.190]) with ESMTPSA (Nemesis) id 0MSv6D-1g2r3c08cH-00RpZW; Tue, 16 Oct 2018 22:38:25 +0200 Date: Tue, 16 Oct 2018 22:38:22 +0200 From: Helge Deller To: linux-parisc@vger.kernel.org, James Bottomley , John David Anglin Subject: [RFC][PATCH v3] parisc: Add alternative coding infrastructure Message-ID: <20181016203822.GA8350@ls3530.fritz.box> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.9.1 (2017-09-22) X-Provags-ID: V03:K1:0TvvCCrAHZKHU3dtKb+tpJpF1RMT6AAmegUTX4Wl/cWfScXqLQU Or4FUDleLLBy7HMsgf8sgsYqx4/RThgwMMmr8RXlkKrz8j3P47oczCTBT/iaOUYhruWZTIk ba6e79NyL4ok5zxgY9x0aeELv9q5pyoUN6o6BPa6BkBSkIgLmDNaW0XzOl5ji1IUOz8aZSA dWWymVENLM0t/isNphPWg== X-UI-Out-Filterresults: notjunk:1;V01:K0:MQOgFiO7zMg=:bnTMvjxARi6DSshxXQxVal FdAtqc0/hkOdC4N+qJGd7Ae5/A1Ukh1/P9pXb98Nn19/jw1xu3XMlSP+4WaSLQuW2RhgEsJ3m sD+OMzZoq4tMSmlJ97axOeve61idDVa+YQxYzTEyhdp1Gubl/XYmY93y7FCDaqkfc2VFW3ZC/ 4I1KSHyJcgFgf+WOZuRRwCERpgAxk/uhM+pR7XdGN9rljqRyZndxRQ4zZXtvv4r0jJTJF7zmL Bl9yHw3LO3a1f+7XKmylX1+LHeMP+R3bSfezJiGCwu0LlaYAJRBVh6v/CGBMzyW2JRiGMqc/0 dfpk+mMKLWhl4yWhO2lSby9IeJOZYJ+58k0Yym9KkwtC1K3TY4r4lUR4AT/M4NYNYu/nlqyoT e5fxlPt9CyaROyh4Ma6qPhP97Ly/u23y5dBo1EdxTTI5YEyQAmMhYMy+66rRX0z66/tT611JS TIeXHSh3DKD3v3IlAcueGFpXX82eFhBSFFcxjPjEDaTXuNUs1SRsr4NC81vtm8k2y3Zufn8Qe JfL4GtiaKeEvyfaisEikNnNKAaTc+VUh7aPwROjS27wPbe4h1h/FlcXFB4BMw0R8sDMtkv6UQ VK8Duay48kLHCC1Xy/XWy1WF2rFRiUPJaq0M+dehYtdWq8blLOca2lrHv1O+BO5jVNY68ynkd uTRniHi/F9sweMDhvV9hPMkLl+Vinc3KZ8rKR/G1Ur2slwR0rMQJ34Q2le3thUUT7NKdU6xaB lPqusmRYbO0DTzhWCfUDMJbnQmBrbwo2MFcvFKo2DAioFd1YIFF8duCHRyHyWacWqJbtZzEN6 b5rARxmr4xHTKs/4XQ+fuiwhqojHQ== Sender: linux-parisc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds the necessary code to patch a running kernel at runtime to improve performance. The current implementation offers a few optimizations variants: - When running a SMP kernel on a single UP processor, unwanted assembler statements like locking functions are overwritten with NOPs. When multiple instructions shall be skipped, one branch instruction is used instead of multiple nop instructions. - In the UP case, some pdtlb and pitlb instructions are patched to become pdtlb,l and pitlb,l which only flushes the CPU-local tlb entries instead of broadcasting the flush to other CPUs in the system and thus may improve performance. - fic and fdc instructions are skipped if no I- or D-caches are installed. This should speed up qemu emulation and cacheless systems. - If no cache coherence is needed for IO operations, the relevant fdc and sync instructions in the sba and ccio drivers are replaced by nops. - On systems which share I- and D-TLBs and thus don't have a seperate instruction TLB, the pitlb instruction is replaced by a nop. Live-patching is done early in the boot process, just after having run the system inventory. No drivers are running and thus no external interrupts should arrive. So the hope is that no TLB exceptions will occur during the patching. If this turns out to be wrong we will probably need to do the patching in real-mode. Signed-off-by: Helge Deller diff --git a/arch/parisc/include/asm/cache.h b/arch/parisc/include/asm/cache.h index 150b7f30ea90..006fb939cac8 100644 --- a/arch/parisc/include/asm/cache.h +++ b/arch/parisc/include/asm/cache.h @@ -6,6 +6,7 @@ #ifndef __ARCH_PARISC_CACHE_H #define __ARCH_PARISC_CACHE_H +#include /* * PA 2.0 processors have 64 and 128-byte L2 cachelines; PA 1.1 processors @@ -41,9 +42,24 @@ extern int icache_stride; extern struct pdc_cache_info cache_info; void parisc_setup_cache_timing(void); -#define pdtlb(addr) asm volatile("pdtlb 0(%%sr1,%0)" : : "r" (addr)); -#define pitlb(addr) asm volatile("pitlb 0(%%sr1,%0)" : : "r" (addr)); -#define pdtlb_kernel(addr) asm volatile("pdtlb 0(%0)" : : "r" (addr)); +#define pdtlb(addr) asm volatile("pdtlb 0(%%sr1,%0)" \ + ALTERNATIVE(ALT_COND_NO_SMP, INSN_PxTLB) \ + : : "r" (addr)) +#define pitlb(addr) asm volatile("pitlb 0(%%sr1,%0)" \ + ALTERNATIVE(ALT_COND_NO_SMP, INSN_PxTLB) \ + ALTERNATIVE(ALT_COND_NO_SPLIT_TLB, INSN_NOP) \ + : : "r" (addr)) +#define pdtlb_kernel(addr) asm volatile("pdtlb 0(%0)" \ + ALTERNATIVE(ALT_COND_NO_SMP, INSN_PxTLB) \ + : : "r" (addr)) + +#define asm_io_fdc(addr) asm volatile("fdc %%r0(%0)" \ + ALTERNATIVE(ALT_COND_NO_DCACHE, INSN_NOP) \ + ALTERNATIVE(ALT_COND_NO_IOC_FDC, INSN_NOP) \ + : : "r" (addr)) +#define asm_io_sync() asm volatile("sync" \ + ALTERNATIVE(ALT_COND_NO_DCACHE, INSN_NOP) \ + ALTERNATIVE(ALT_COND_NO_IOC_FDC, INSN_NOP) :: ) #endif /* ! __ASSEMBLY__ */ diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h index b86c31291f0a..94c0ef7a9e03 100644 --- a/arch/parisc/include/asm/pgtable.h +++ b/arch/parisc/include/asm/pgtable.h @@ -43,8 +43,7 @@ static inline void purge_tlb_entries(struct mm_struct *mm, unsigned long addr) { mtsp(mm->context, 1); pdtlb(addr); - if (unlikely(split_tlb)) - pitlb(addr); + pitlb(addr); } /* Certain architectures need to do special things when PTEs diff --git a/arch/parisc/include/asm/sections.h b/arch/parisc/include/asm/sections.h index 5a40b51df80c..bb52aea0cb21 100644 --- a/arch/parisc/include/asm/sections.h +++ b/arch/parisc/include/asm/sections.h @@ -5,6 +5,8 @@ /* nothing to see, move along */ #include +extern char __alt_instructions[], __alt_instructions_end[]; + #ifdef CONFIG_64BIT #define HAVE_DEREFERENCE_FUNCTION_DESCRIPTOR 1 diff --git a/arch/parisc/include/asm/tlbflush.h b/arch/parisc/include/asm/tlbflush.h index 14668bd52d60..6804374efa66 100644 --- a/arch/parisc/include/asm/tlbflush.h +++ b/arch/parisc/include/asm/tlbflush.h @@ -85,8 +85,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, purge_tlb_start(flags); mtsp(sid, 1); pdtlb(addr); - if (unlikely(split_tlb)) - pitlb(addr); + pitlb(addr); purge_tlb_end(flags); } #endif diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c index 4209b74ce63c..789ae96b0dbb 100644 --- a/arch/parisc/kernel/cache.c +++ b/arch/parisc/kernel/cache.c @@ -479,18 +479,6 @@ int __flush_tlb_range(unsigned long sid, unsigned long start, /* Purge TLB entries for small ranges using the pdtlb and pitlb instructions. These instructions execute locally but cause a purge request to be broadcast to other TLBs. */ - if (likely(!split_tlb)) { - while (start < end) { - purge_tlb_start(flags); - mtsp(sid, 1); - pdtlb(start); - purge_tlb_end(flags); - start += PAGE_SIZE; - } - return 0; - } - - /* split TLB case */ while (start < end) { purge_tlb_start(flags); mtsp(sid, 1); diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S index 7c85a91b4710..d3e2633cd688 100644 --- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -38,6 +38,7 @@ #include #include #include +#include #include @@ -464,7 +465,7 @@ /* Acquire pa_tlb_lock lock and check page is present. */ .macro tlb_lock spc,ptp,pte,tmp,tmp1,fault #ifdef CONFIG_SMP - cmpib,COND(=),n 0,\spc,2f +98: cmpib,COND(=),n 0,\spc,2f load_pa_tlb_lock \tmp 1: LDCW 0(\tmp),\tmp1 cmpib,COND(=) 0,\tmp1,1b @@ -473,6 +474,7 @@ bb,<,n \pte,_PAGE_PRESENT_BIT,3f b \fault stw,ma \spc,0(\tmp) +99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) #endif 2: LDREG 0(\ptp),\pte bb,>=,n \pte,_PAGE_PRESENT_BIT,\fault @@ -482,15 +484,17 @@ /* Release pa_tlb_lock lock without reloading lock address. */ .macro tlb_unlock0 spc,tmp #ifdef CONFIG_SMP - or,COND(=) %r0,\spc,%r0 +98: or,COND(=) %r0,\spc,%r0 stw,ma \spc,0(\tmp) +99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) #endif .endm /* Release pa_tlb_lock lock. */ .macro tlb_unlock1 spc,tmp #ifdef CONFIG_SMP - load_pa_tlb_lock \tmp +98: load_pa_tlb_lock \tmp +99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) tlb_unlock0 \spc,\tmp #endif .endm diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S index f33bf2d306d6..b41c0136a05f 100644 --- a/arch/parisc/kernel/pacache.S +++ b/arch/parisc/kernel/pacache.S @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -190,7 +191,7 @@ ENDPROC_CFI(flush_tlb_all_local) .import cache_info,data ENTRY_CFI(flush_instruction_cache_local) - load32 cache_info, %r1 +88: load32 cache_info, %r1 /* Flush Instruction Cache */ @@ -243,6 +244,7 @@ fioneloop2: fisync: sync mtsm %r22 /* restore I-bit */ +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_ICACHE, INSN_NOP) bv %r0(%r2) nop ENDPROC_CFI(flush_instruction_cache_local) @@ -250,7 +252,7 @@ ENDPROC_CFI(flush_instruction_cache_local) .import cache_info, data ENTRY_CFI(flush_data_cache_local) - load32 cache_info, %r1 +88: load32 cache_info, %r1 /* Flush Data Cache */ @@ -304,6 +306,7 @@ fdsync: syncdma sync mtsm %r22 /* restore I-bit */ +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) bv %r0(%r2) nop ENDPROC_CFI(flush_data_cache_local) @@ -312,6 +315,7 @@ ENDPROC_CFI(flush_data_cache_local) .macro tlb_lock la,flags,tmp #ifdef CONFIG_SMP +98: #if __PA_LDCW_ALIGNMENT > 4 load32 pa_tlb_lock + __PA_LDCW_ALIGNMENT-1, \la depi 0,31,__PA_LDCW_ALIGN_ORDER, \la @@ -326,15 +330,17 @@ ENDPROC_CFI(flush_data_cache_local) nop b,n 2b 3: +99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) #endif .endm .macro tlb_unlock la,flags,tmp #ifdef CONFIG_SMP - ldi 1,\tmp +98: ldi 1,\tmp sync stw \tmp,0(\la) mtsm \flags +99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) #endif .endm @@ -596,9 +602,11 @@ ENTRY_CFI(copy_user_page_asm) pdtlb,l %r0(%r29) #else tlb_lock %r20,%r21,%r22 - pdtlb %r0(%r28) - pdtlb %r0(%r29) +0: pdtlb %r0(%r28) +1: pdtlb %r0(%r29) tlb_unlock %r20,%r21,%r22 + ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB) + ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SMP, INSN_PxTLB) #endif #ifdef CONFIG_64BIT @@ -736,8 +744,9 @@ ENTRY_CFI(clear_user_page_asm) pdtlb,l %r0(%r28) #else tlb_lock %r20,%r21,%r22 - pdtlb %r0(%r28) +0: pdtlb %r0(%r28) tlb_unlock %r20,%r21,%r22 + ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB) #endif #ifdef CONFIG_64BIT @@ -813,11 +822,12 @@ ENTRY_CFI(flush_dcache_page_asm) pdtlb,l %r0(%r28) #else tlb_lock %r20,%r21,%r22 - pdtlb %r0(%r28) +0: pdtlb %r0(%r28) tlb_unlock %r20,%r21,%r22 + ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB) #endif - ldil L%dcache_stride, %r1 +88: ldil L%dcache_stride, %r1 ldw R%dcache_stride(%r1), r31 #ifdef CONFIG_64BIT @@ -847,6 +857,7 @@ ENTRY_CFI(flush_dcache_page_asm) cmpb,COND(<<) %r28, %r25,1b fdc,m r31(%r28) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) sync bv %r0(%r2) nop @@ -874,15 +885,19 @@ ENTRY_CFI(flush_icache_page_asm) #ifdef CONFIG_PA20 pdtlb,l %r0(%r28) - pitlb,l %r0(%sr4,%r28) +1: pitlb,l %r0(%sr4,%r28) + ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SPLIT_TLB, INSN_NOP) #else tlb_lock %r20,%r21,%r22 - pdtlb %r0(%r28) - pitlb %r0(%sr4,%r28) +0: pdtlb %r0(%r28) +1: pitlb %r0(%sr4,%r28) tlb_unlock %r20,%r21,%r22 + ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB) + ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SMP, INSN_PxTLB) + ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SPLIT_TLB, INSN_NOP) #endif - ldil L%icache_stride, %r1 +88: ldil L%icache_stride, %r1 ldw R%icache_stride(%r1), %r31 #ifdef CONFIG_64BIT @@ -914,13 +929,14 @@ ENTRY_CFI(flush_icache_page_asm) cmpb,COND(<<) %r28, %r25,1b fic,m %r31(%sr4,%r28) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_ICACHE, INSN_NOP) sync bv %r0(%r2) nop ENDPROC_CFI(flush_icache_page_asm) ENTRY_CFI(flush_kernel_dcache_page_asm) - ldil L%dcache_stride, %r1 +88: ldil L%dcache_stride, %r1 ldw R%dcache_stride(%r1), %r23 #ifdef CONFIG_64BIT @@ -950,13 +966,14 @@ ENTRY_CFI(flush_kernel_dcache_page_asm) cmpb,COND(<<) %r26, %r25,1b fdc,m %r23(%r26) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) sync bv %r0(%r2) nop ENDPROC_CFI(flush_kernel_dcache_page_asm) ENTRY_CFI(purge_kernel_dcache_page_asm) - ldil L%dcache_stride, %r1 +88: ldil L%dcache_stride, %r1 ldw R%dcache_stride(%r1), %r23 #ifdef CONFIG_64BIT @@ -985,13 +1002,14 @@ ENTRY_CFI(purge_kernel_dcache_page_asm) cmpb,COND(<<) %r26, %r25, 1b pdc,m %r23(%r26) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) sync bv %r0(%r2) nop ENDPROC_CFI(purge_kernel_dcache_page_asm) ENTRY_CFI(flush_user_dcache_range_asm) - ldil L%dcache_stride, %r1 +88: ldil L%dcache_stride, %r1 ldw R%dcache_stride(%r1), %r23 ldo -1(%r23), %r21 ANDCM %r26, %r21, %r26 @@ -999,13 +1017,14 @@ ENTRY_CFI(flush_user_dcache_range_asm) 1: cmpb,COND(<<),n %r26, %r25, 1b fdc,m %r23(%sr3, %r26) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) sync bv %r0(%r2) nop ENDPROC_CFI(flush_user_dcache_range_asm) ENTRY_CFI(flush_kernel_dcache_range_asm) - ldil L%dcache_stride, %r1 +88: ldil L%dcache_stride, %r1 ldw R%dcache_stride(%r1), %r23 ldo -1(%r23), %r21 ANDCM %r26, %r21, %r26 @@ -1014,13 +1033,14 @@ ENTRY_CFI(flush_kernel_dcache_range_asm) fdc,m %r23(%r26) sync +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) syncdma bv %r0(%r2) nop ENDPROC_CFI(flush_kernel_dcache_range_asm) ENTRY_CFI(purge_kernel_dcache_range_asm) - ldil L%dcache_stride, %r1 +88: ldil L%dcache_stride, %r1 ldw R%dcache_stride(%r1), %r23 ldo -1(%r23), %r21 ANDCM %r26, %r21, %r26 @@ -1029,13 +1049,14 @@ ENTRY_CFI(purge_kernel_dcache_range_asm) pdc,m %r23(%r26) sync +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_DCACHE, INSN_NOP) syncdma bv %r0(%r2) nop ENDPROC_CFI(purge_kernel_dcache_range_asm) ENTRY_CFI(flush_user_icache_range_asm) - ldil L%icache_stride, %r1 +88: ldil L%icache_stride, %r1 ldw R%icache_stride(%r1), %r23 ldo -1(%r23), %r21 ANDCM %r26, %r21, %r26 @@ -1043,13 +1064,14 @@ ENTRY_CFI(flush_user_icache_range_asm) 1: cmpb,COND(<<),n %r26, %r25,1b fic,m %r23(%sr3, %r26) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_ICACHE, INSN_NOP) sync bv %r0(%r2) nop ENDPROC_CFI(flush_user_icache_range_asm) ENTRY_CFI(flush_kernel_icache_page) - ldil L%icache_stride, %r1 +88: ldil L%icache_stride, %r1 ldw R%icache_stride(%r1), %r23 #ifdef CONFIG_64BIT @@ -1079,13 +1101,14 @@ ENTRY_CFI(flush_kernel_icache_page) cmpb,COND(<<) %r26, %r25, 1b fic,m %r23(%sr4, %r26) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_ICACHE, INSN_NOP) sync bv %r0(%r2) nop ENDPROC_CFI(flush_kernel_icache_page) ENTRY_CFI(flush_kernel_icache_range_asm) - ldil L%icache_stride, %r1 +88: ldil L%icache_stride, %r1 ldw R%icache_stride(%r1), %r23 ldo -1(%r23), %r21 ANDCM %r26, %r21, %r26 @@ -1093,6 +1116,7 @@ ENTRY_CFI(flush_kernel_icache_range_asm) 1: cmpb,COND(<<),n %r26, %r25, 1b fic,m %r23(%sr4, %r26) +89: ALTERNATIVE(88b, 89b, ALT_COND_NO_ICACHE, INSN_NOP) sync bv %r0(%r2) nop diff --git a/arch/parisc/kernel/setup.c b/arch/parisc/kernel/setup.c index 4e87c35c22b7..b18c6accfbf9 100644 --- a/arch/parisc/kernel/setup.c +++ b/arch/parisc/kernel/setup.c @@ -305,6 +305,69 @@ static int __init parisc_init_resources(void) return 0; } +static void __init apply_alternatives_all(void) +{ + struct alt_instr *entry; + int index = 0; + + pr_info("alternatives: patching kernel code\n"); + set_kernel_text_rw(1); + + for (entry = (struct alt_instr *) &__alt_instructions; + entry < (struct alt_instr *) &__alt_instructions_end; + entry++, index++) { + + u32 *from, len, cond, replacement; + + from = (u32 *)((ulong)&entry->orig_offset + entry->orig_offset); + len = entry->len; + cond = entry->cond; + replacement = entry->replacement; + + WARN_ON(!cond); + pr_info("Check %d: Cond 0x%x, Replace %02d instructions @ 0x%px with 0x%08x\n", + index, cond, len, from, replacement); + + if ((cond & ALT_COND_NO_SMP) && (num_online_cpus() != 1)) + continue; + if ((cond & ALT_COND_NO_DCACHE) && (cache_info.dc_size != 0)) + continue; + if ((cond & ALT_COND_NO_ICACHE) && (cache_info.ic_size != 0)) + continue; + + /* + * If the PDC_MODEL capabilities has Non-coherent IO-PDIR bit + * set (bit #61, big endian), we have to flush and sync every + * time IO-PDIR is changed in Ike/Astro. + */ + if ((cond & ALT_COND_NO_IOC_FDC) && + (boot_cpu_data.pdc.capabilities & PDC_MODEL_IOPDIR_FDC)) + continue; + + /* Want to replace pdtlb by a pdtlb,l instruction? */ + if (replacement == INSN_PxTLB) { + replacement = *from; + if (boot_cpu_data.cpu_type >= pcxu) /* >= pa2.0 ? */ + replacement |= (1 << 10); /* set el bit */ + } + + /* + * Replace instruction with NOPs? + * For long distance insert a branch instruction instead. + */ + if (replacement == INSN_NOP && len > 1) + replacement = 0xe8000002 + (len-2)*8; /* "b,n .+8" */ + + pr_info("Do %d: Cond 0x%x, Replace %02d instructions @ 0x%px with 0x%08x\n", + index, cond, len, from, replacement); + + /* Replace instruction */ + *from = replacement; + } + + set_kernel_text_rw(0); +} + extern void gsc_init(void); extern void processor_init(void); extern void ccio_init(void); @@ -346,6 +409,7 @@ static int __init parisc_init(void) boot_cpu_data.cpu_hz / 1000000, boot_cpu_data.cpu_hz % 1000000 ); + apply_alternatives_all(); parisc_setup_cache_timing(); /* These are in a non-obvious order, will fix when we have an iotree */ diff --git a/arch/parisc/kernel/signal.c b/arch/parisc/kernel/signal.c index 342073f44d3f..848c1934680b 100644 --- a/arch/parisc/kernel/signal.c +++ b/arch/parisc/kernel/signal.c @@ -65,7 +65,6 @@ #define INSN_LDI_R25_1 0x34190002 /* ldi 1,%r25 (in_syscall=1) */ #define INSN_LDI_R20 0x3414015a /* ldi __NR_rt_sigreturn,%r20 */ #define INSN_BLE_SR2_R0 0xe4008200 /* be,l 0x100(%sr2,%r0),%sr0,%r31 */ -#define INSN_NOP 0x08000240 /* nop */ /* For debugging */ #define INSN_DIE_HORRIBLY 0x68000ccc /* stw %r0,0x666(%sr0,%r0) */ diff --git a/arch/parisc/kernel/vmlinux.lds.S b/arch/parisc/kernel/vmlinux.lds.S index da2e31190efa..ef721fc3671b 100644 --- a/arch/parisc/kernel/vmlinux.lds.S +++ b/arch/parisc/kernel/vmlinux.lds.S @@ -25,7 +25,7 @@ #include #include #include - + /* ld script to make hppa Linux kernel */ #ifndef CONFIG_64BIT OUTPUT_FORMAT("elf32-hppa-linux") @@ -61,6 +61,12 @@ SECTIONS EXIT_DATA } PERCPU_SECTION(8) + . = ALIGN(4); + .altinstructions : { + __alt_instructions = .; + *(.altinstructions) + __alt_instructions_end = .; + } . = ALIGN(HUGEPAGE_SIZE); __init_end = .; /* freed after init ends here */ diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c index aae9b0d71c1e..e7e626bcd0be 100644 --- a/arch/parisc/mm/init.c +++ b/arch/parisc/mm/init.c @@ -511,6 +511,21 @@ static void __init map_pages(unsigned long start_vaddr, } } +void __init set_kernel_text_rw(int enable_read_write) +{ + unsigned long start = (unsigned long)_stext; + unsigned long end = (unsigned long)_etext; + + map_pages(start, __pa(start), end-start, + PAGE_KERNEL_RWX, enable_read_write ? 1:0); + + /* force the kernel to see the new TLB entries */ + __flush_tlb_range(0, start, end); + + /* dump old cached instructions */ + flush_icache_range(start, end); +} + void __ref free_initmem(void) { unsigned long init_begin = (unsigned long)__init_begin; diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c index 614823617b8b..701a7d6a74d5 100644 --- a/drivers/parisc/ccio-dma.c +++ b/drivers/parisc/ccio-dma.c @@ -609,14 +609,13 @@ ccio_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba, ** PCX-T'? Don't know. (eg C110 or similar K-class) ** ** See PDC_MODEL/option 0/SW_CAP word for "Non-coherent IO-PDIR bit". - ** Hopefully we can patch (NOP) these out at boot time somehow. ** ** "Since PCX-U employs an offset hash that is incompatible with ** the real mode coherence index generation of U2, the PDIR entry ** must be flushed to memory to retain coherence." */ - asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr)); - asm volatile("sync"); + asm_io_fdc(pdir_ptr); + asm_io_sync(); } /** @@ -682,17 +681,14 @@ ccio_mark_invalid(struct ioc *ioc, dma_addr_t iova, size_t byte_cnt) ** FIXME: PCX_W platforms don't need FDC/SYNC. (eg C360) ** PCX-U/U+ do. (eg C200/C240) ** See PDC_MODEL/option 0/SW_CAP for "Non-coherent IO-PDIR bit". - ** - ** Hopefully someone figures out how to patch (NOP) the - ** FDC/SYNC out at boot time. */ - asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr[7])); + asm_io_fdc(pdir_ptr); iovp += IOVP_SIZE; byte_cnt -= IOVP_SIZE; } - asm volatile("sync"); + asm_io_sync(); ccio_clear_io_tlb(ioc, CCIO_IOVP(iova), saved_byte_cnt); } diff --git a/drivers/parisc/sba_iommu.c b/drivers/parisc/sba_iommu.c index 11de0eccf968..c1e599a429af 100644 --- a/drivers/parisc/sba_iommu.c +++ b/drivers/parisc/sba_iommu.c @@ -587,8 +587,7 @@ sba_io_pdir_entry(u64 *pdir_ptr, space_t sid, unsigned long vba, * (bit #61, big endian), we have to flush and sync every time * IO-PDIR is changed in Ike/Astro. */ - if (ioc_needs_fdc) - asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr)); + asm_io_fdc(pdir_ptr); } @@ -641,8 +640,8 @@ sba_mark_invalid(struct ioc *ioc, dma_addr_t iova, size_t byte_cnt) do { /* clear I/O Pdir entry "valid" bit first */ ((u8 *) pdir_ptr)[7] = 0; + asm_io_fdc(pdir_ptr); if (ioc_needs_fdc) { - asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr)); #if 0 entries_per_cacheline = L1_CACHE_SHIFT - 3; #endif @@ -661,8 +660,7 @@ sba_mark_invalid(struct ioc *ioc, dma_addr_t iova, size_t byte_cnt) ** could dump core on HPMC. */ ((u8 *) pdir_ptr)[7] = 0; - if (ioc_needs_fdc) - asm volatile("fdc %%r0(%0)" : : "r" (pdir_ptr)); + asm_io_fdc(pdir_ptr); WRITE_REG( SBA_IOVA(ioc, iovp, 0, 0), ioc->ioc_hpa+IOC_PCOM); } @@ -773,8 +771,7 @@ sba_map_single(struct device *dev, void *addr, size_t size, } /* force FDC ops in io_pdir_entry() to be visible to IOMMU */ - if (ioc_needs_fdc) - asm volatile("sync" : : ); + asm_io_sync(); #ifdef ASSERT_PDIR_SANITY sba_check_pdir(ioc,"Check after sba_map_single()"); @@ -858,8 +855,7 @@ sba_unmap_page(struct device *dev, dma_addr_t iova, size_t size, sba_free_range(ioc, iova, size); /* If fdc's were issued, force fdc's to be visible now */ - if (ioc_needs_fdc) - asm volatile("sync" : : ); + asm_io_sync(); READ_REG(ioc->ioc_hpa+IOC_PCOM); /* flush purges */ #endif /* DELAYED_RESOURCE_CNT == 0 */ @@ -1008,8 +1004,7 @@ sba_map_sg(struct device *dev, struct scatterlist *sglist, int nents, filled = iommu_fill_pdir(ioc, sglist, nents, 0, sba_io_pdir_entry); /* force FDC ops in io_pdir_entry() to be visible to IOMMU */ - if (ioc_needs_fdc) - asm volatile("sync" : : ); + asm_io_sync(); #ifdef ASSERT_PDIR_SANITY if (sba_check_pdir(ioc,"Check after sba_map_sg()"))