From patchwork Sun Sep 16 23:30:11 2018
X-Patchwork-Submitter: John David Anglin
X-Patchwork-Id: 10601905
Date: Sun, 16 Sep 2018 19:30:11 -0400
From: John David Anglin
To: linux-parisc@vger.kernel.org
Cc: dave.anglin@bell.net, deller@gmx.de, James.Bottomley@HansenPartnership.com
Subject: [PATCH] parisc: Remove locking from TLB handler and set page accessed
Message-ID: <20180916233010.GA5591@mx3210.localdomain>

Locking was introduced into the TLB handlers primarily to ensure that
updates to the accessed and dirty bits in PTE entries were atomic.
However, this has a significant impact on performance.

Also, it turned out that the user page defines in pgtable.h always set
the accessed bit, so it wasn't necessary to update the accessed bit at
all.  The impact of this is that the kernel doesn't know if a page has
been accessed or not.

This patch removes the locking from the TLB handlers.  We update the
page accessed bit in the page table entries using a stb instruction
that doesn't affect the dirty bit.  Thus, an access can't clobber the
dirty bit.
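To make the hazard concrete, here is a minimal C sketch of the lost
update that a whole-word read-modify-write from the miss handler could
cause, and that a byte-sized store avoids.  This is illustration only,
not part of the patch; the helper name and the flattened pte_t model
are invented, and the bit values just follow the pgtable.h layout in
which the accessed and dirty bits live in different bytes of the PTE.

    #define _PAGE_DIRTY	0x020UL		/* D bit: lowest-order byte of the PTE value */
    #define _PAGE_ACCESSED	0x100UL		/* software accessed bit: the next byte up */

    typedef unsigned long pte_t;

    /* Unlocked whole-word update: if another CPU sets _PAGE_DIRTY between
     * the load and the store, the store writes the stale word back and the
     * dirty bit is lost.  This is what the old spinlock guarded against. */
    void set_accessed_wordwise(pte_t *ptep)
    {
    	pte_t v = *ptep;		/* another CPU may set _PAGE_DIRTY here */
    	*ptep = v | _PAGE_ACCESSED;	/* ...and this store would wipe it out */
    }

A store that touches only the byte containing _PAGE_ACCESSED can never
rewrite the byte containing _PAGE_DIRTY, which is why the stb used by
update_accessed below is safe without the lock.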
With this change, the accessed bit in user PTE entries is now updated
when a page is accessed and hopefully this will improve performance.

Signed-off-by: John David Anglin

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index fa6b7c78f18a..7efce946ba04 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -202,7 +202,7 @@ static inline void purge_tlb_entries(struct mm_struct *mm, unsigned long addr)
 #define _PAGE_HUGE	(1 << xlate_pabit(_PAGE_HPAGE_BIT))
 #define _PAGE_USER	(1 << xlate_pabit(_PAGE_USER_BIT))
 
-#define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_READ | _PAGE_WRITE | _PAGE_DIRTY | _PAGE_ACCESSED)
+#define _PAGE_TABLE	(_PAGE_PRESENT | _PAGE_READ | _PAGE_WRITE | _PAGE_DIRTY | _PAGE_ACCESSED)
 #define _PAGE_CHG_MASK	(PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY)
 #define _PAGE_KERNEL_RO	(_PAGE_PRESENT | _PAGE_READ | _PAGE_DIRTY | _PAGE_ACCESSED)
 #define _PAGE_KERNEL_EXEC	(_PAGE_KERNEL_RO | _PAGE_EXEC)
@@ -227,22 +227,22 @@ static inline void purge_tlb_entries(struct mm_struct *mm, unsigned long addr)
 
 #ifndef __ASSEMBLY__
 
-#define PAGE_NONE	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_ACCESSED)
-#define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_WRITE | _PAGE_ACCESSED)
+#define PAGE_NONE	__pgprot(_PAGE_PRESENT | _PAGE_USER)
+#define PAGE_SHARED	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_WRITE)
 /* Others seem to make this executable, I don't know if that's correct
    or not.  The stack is mapped this way though so this is necessary
    in the short term - dhd@linuxcare.com, 2000-08-08 */
-#define PAGE_READONLY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_ACCESSED)
-#define PAGE_WRITEONLY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_WRITE | _PAGE_ACCESSED)
-#define PAGE_EXECREAD	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_EXEC |_PAGE_ACCESSED)
+#define PAGE_READONLY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ)
+#define PAGE_WRITEONLY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_WRITE)
+#define PAGE_EXECREAD	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_EXEC)
 #define PAGE_COPY	PAGE_EXECREAD
-#define PAGE_RWX	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_WRITE | _PAGE_EXEC |_PAGE_ACCESSED)
+#define PAGE_RWX	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)
 #define PAGE_KERNEL	__pgprot(_PAGE_KERNEL)
 #define PAGE_KERNEL_EXEC	__pgprot(_PAGE_KERNEL_EXEC)
 #define PAGE_KERNEL_RWX	__pgprot(_PAGE_KERNEL_RWX)
 #define PAGE_KERNEL_RO	__pgprot(_PAGE_KERNEL_RO)
 #define PAGE_KERNEL_UNC	__pgprot(_PAGE_KERNEL | _PAGE_NO_CACHE)
-#define PAGE_GATEWAY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_ACCESSED | _PAGE_GATEWAY| _PAGE_READ)
+#define PAGE_GATEWAY	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_GATEWAY| _PAGE_READ)
 
 /*
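For reference, the effect of dropping _PAGE_ACCESSED from the user
protection values above is that pte_young() can now tell whether a page
has really been touched.  A simplified sketch of the accessors involved
(modelled on the parisc pgtable.h definitions, with pte_t flattened to
unsigned long purely for illustration):

    typedef unsigned long pte_t;
    #define pte_val(x)	(x)
    #define _PAGE_ACCESSED	0x100UL

    static inline int pte_young(pte_t pte)   { return pte_val(pte) & _PAGE_ACCESSED; }
    static inline pte_t pte_mkold(pte_t pte) { pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }

Since the protection values no longer pre-set the bit, pte_young() now
reflects whether the TLB miss handler has actually recorded an access.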
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index 1b4732e20137..8039241b669b 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -46,16 +46,6 @@
 	.level 2.0
 #endif
 
-	.import		pa_tlb_lock,data
-	.macro	load_pa_tlb_lock reg
-#if __PA_LDCW_ALIGNMENT > 4
-	load32	PA(pa_tlb_lock) + __PA_LDCW_ALIGNMENT-1, \reg
-	depi	0,31,__PA_LDCW_ALIGN_ORDER, \reg
-#else
-	load32	PA(pa_tlb_lock), \reg
-#endif
-	.endm
-
 	/* space_to_prot macro creates a prot id from a space id */
 
 #if (SPACEID_SHIFT) == 0
@@ -462,47 +452,17 @@
 	L2_ptep		\pgd,\pte,\index,\va,\fault
 	.endm
 
-	/* Acquire pa_tlb_lock lock and recheck page is still present. */
-	.macro		tlb_lock	spc,ptp,pte,tmp,tmp1,fault
-#ifdef CONFIG_SMP
-	cmpib,COND(=),n	0,\spc,2f
-	load_pa_tlb_lock \tmp
-1:	LDCW		0(\tmp),\tmp1
-	cmpib,COND(=)	0,\tmp1,1b
-	nop
-	LDREG		0(\ptp),\pte
-	bb,<,n		\pte,_PAGE_PRESENT_BIT,2f
-	b		\fault
-	stw		\spc,0(\tmp)
-2:
-#endif
-	.endm
-
-	/* Release pa_tlb_lock lock without reloading lock address. */
-	.macro		tlb_unlock0	spc,tmp
-#ifdef CONFIG_SMP
-	or,COND(=)	%r0,\spc,%r0
-	sync
-	or,COND(=)	%r0,\spc,%r0
-	stw		\spc,0(\tmp)
-#endif
-	.endm
-
-	/* Release pa_tlb_lock lock. */
-	.macro		tlb_unlock1	spc,tmp
-#ifdef CONFIG_SMP
-	load_pa_tlb_lock \tmp
-	tlb_unlock0	\spc,\tmp
-#endif
-	.endm
-
-	/* Set the _PAGE_ACCESSED bit of the PTE.  Be clever and
-	 * don't needlessly dirty the cache line if it was already set */
+	/* Set the _PAGE_ACCESSED bit of the PTE without overwriting the
+	 * dirty bit.  Fortunately, it resides in a different byte of
+	 * the PTE than the dirty bit.  Be clever and don't needlessly
+	 * dirty the cache line if it was already set */
 	.macro		update_accessed	ptp,pte,tmp,tmp1
 	ldi		_PAGE_ACCESSED,\tmp1
 	or		\tmp1,\pte,\tmp
+
+	/* Extract aligned byte from PTE containing page accessed bit */
+	extru		\tmp,_PAGE_ACCESSED_BIT,8,\tmp
+
 	and,COND(<>)	\tmp1,\pte,%r0
-	STREG		\tmp,0(\ptp)
+	stb		\tmp,__BITS_PER_LONG/8-2(\ptp)
 	.endm
 
 	/* Set the dirty bit (and accessed bit).  No need to be
@@ -1167,14 +1127,12 @@ dtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,dtlb_check_alias_20w
 
-	tlb_lock	spc,ptp,pte,t0,t1,dtlb_check_alias_20w
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
 
 	idtlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1193,14 +1151,12 @@ nadtlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,nadtlb_check_alias_20w
 
-	tlb_lock	spc,ptp,pte,t0,t1,nadtlb_check_alias_20w
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
 
 	idtlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1221,7 +1177,6 @@ dtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_11
 
-	tlb_lock	spc,ptp,pte,t0,t1,dtlb_check_alias_11
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
@@ -1234,7 +1189,6 @@ dtlb_miss_11:
 
 	mtsp		t1, %sr1	/* Restore sr1 */
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1254,7 +1208,6 @@ nadtlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_alias_11
 
-	tlb_lock	spc,ptp,pte,t0,t1,nadtlb_check_alias_11
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
@@ -1267,7 +1220,6 @@ nadtlb_miss_11:
 
 	mtsp		t1, %sr1	/* Restore sr1 */
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1287,7 +1239,6 @@ dtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,dtlb_check_alias_20
 
-	tlb_lock	spc,ptp,pte,t0,t1,dtlb_check_alias_20
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
@@ -1296,7 +1247,6 @@ dtlb_miss_20:
 
 	idtlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1315,7 +1265,6 @@ nadtlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,nadtlb_check_alias_20
 
-	tlb_lock	spc,ptp,pte,t0,t1,nadtlb_check_alias_20
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
@@ -1324,7 +1273,6 @@ nadtlb_miss_20:
 
 	idtlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1424,14 +1372,12 @@ itlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,itlb_fault
 
-	tlb_lock	spc,ptp,pte,t0,t1,itlb_fault
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
 
 	iitlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1448,14 +1394,12 @@ naitlb_miss_20w:
 
 	L3_ptep		ptp,pte,t0,va,naitlb_check_alias_20w
 
-	tlb_lock	spc,ptp,pte,t0,t1,naitlb_check_alias_20w
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
 
 	iitlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1476,7 +1420,6 @@ itlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	tlb_lock	spc,ptp,pte,t0,t1,itlb_fault
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
@@ -1489,7 +1432,6 @@ itlb_miss_11:
 
 	mtsp		t1, %sr1	/* Restore sr1 */
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1500,7 +1442,6 @@ naitlb_miss_11:
 
 	L2_ptep		ptp,pte,t0,va,naitlb_check_alias_11
 
-	tlb_lock	spc,ptp,pte,t0,t1,naitlb_check_alias_11
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb_11	spc,pte,prot
@@ -1513,7 +1454,6 @@ naitlb_miss_11:
 
 	mtsp		t1, %sr1	/* Restore sr1 */
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1534,7 +1474,6 @@ itlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,itlb_fault
 
-	tlb_lock	spc,ptp,pte,t0,t1,itlb_fault
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
@@ -1543,7 +1482,6 @@ itlb_miss_20:
 
 	iitlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1554,7 +1492,6 @@ naitlb_miss_20:
 
 	L2_ptep		ptp,pte,t0,va,naitlb_check_alias_20
 
-	tlb_lock	spc,ptp,pte,t0,t1,naitlb_check_alias_20
 	update_accessed	ptp,pte,t0,t1
 
 	make_insert_tlb	spc,pte,prot,t1
@@ -1563,7 +1500,6 @@ naitlb_miss_20:
 
 	iitlbt		pte,prot
 
-	tlb_unlock1	spc,t0
 	rfir
 	nop
 
@@ -1586,14 +1522,12 @@ dbit_trap_20w:
 
 	L3_ptep		ptp,pte,t0,va,dbit_fault
 
-	tlb_lock	spc,ptp,pte,t0,t1,dbit_fault
 	update_dirty	ptp,pte,t1
 
 	make_insert_tlb	spc,pte,prot,t1
 
 	idtlbt		pte,prot
 
-	tlb_unlock0	spc,t0
 	rfir
 	nop
 #else
@@ -1606,7 +1540,6 @@ dbit_trap_11:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-	tlb_lock	spc,ptp,pte,t0,t1,dbit_fault
 	update_dirty	ptp,pte,t1
 
 	make_insert_tlb_11	spc,pte,prot
@@ -1619,7 +1552,6 @@ dbit_trap_11:
 
 	mtsp		t1, %sr1	/* Restore sr1 */
 
-	tlb_unlock0	spc,t0
 	rfir
 	nop
 
@@ -1630,7 +1562,6 @@ dbit_trap_20:
 
 	L2_ptep		ptp,pte,t0,va,dbit_fault
 
-	tlb_lock	spc,ptp,pte,t0,t1,dbit_fault
 	update_dirty	ptp,pte,t1
 
 	make_insert_tlb	spc,pte,prot,t1
@@ -1639,7 +1570,6 @@ dbit_trap_20:
 
 	idtlbt		pte,prot
 
-	tlb_unlock0	spc,t0
 	rfir
 	nop
 #endif
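As a rough C rendering of what the new update_accessed macro does
(illustration only, not part of the patch; the function name is
invented, PA-RISC is big-endian, and the byte offset matches the
__BITS_PER_LONG/8-2 used by the stb above):

    #include <stdint.h>

    #define _PAGE_ACCESSED	0x100UL

    typedef unsigned long pte_t;

    /* Mirrors the macro: or in the accessed bit, extru the byte that
     * holds it, skip the store if it was already set (and,COND(<>)),
     * otherwise stb just that one byte back.  The byte holding the
     * dirty bit is never written. */
    void update_accessed_sketch(pte_t *ptep)
    {
    	pte_t v = *ptep;

    	if (v & _PAGE_ACCESSED)
    		return;		/* don't needlessly dirty the cache line */

    	v |= _PAGE_ACCESSED;
    	/* big-endian: byte (sizeof(long) - 2) holds bits 8..15 of the
    	 * value, i.e. the byte containing _PAGE_ACCESSED */
    	((uint8_t *)ptep)[sizeof(pte_t) - 2] = (uint8_t)(v >> 8);
    }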