From: Cédric Le Goater
To: David Gibson
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, Cedric Le Goater
Date: Tue, 3 May 2016 18:03:25 +0200
Subject: [Qemu-devel] [PATCH 03/12] ppc: Do some batching of TCG tlb flushes
Message-Id: <1462291414-8343-4-git-send-email-clg@kaod.org>
In-Reply-To: <1462291414-8343-1-git-send-email-clg@kaod.org>
X-Patchwork-Id: 9006051

From: Benjamin Herrenschmidt

On ppc64 especially, we flush the TLB on any slbie or tlbie instruction.

However, those instructions often come in bursts of 3 or more. A context
switch, for example, will favor a series of slbie's over an slbia if the
SLB has fewer than a certain number of entries in it, and tlbie's can
also happen in a series. With PAPR, H_BULK_REMOVE can remove up to 4
entries at a time.

Doing a tlb_flush() each time is a waste of time. We end up doing a
memset of the whole TLB, reloading it for the next instruction,
memset'ing again, etc...

Those instructions don't have to take effect immediately. For slbie,
they can wait for the next context synchronizing event. For tlbie, the
next tlbsync.

This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.

This reduces the number of tlb_flush() calls on a boot to the first
dialog screen of an Ubuntu installer from roughly 360K down to 36K.

Signed-off-by: Benjamin Herrenschmidt
[clg: added a 'CPUPPCState *' variable in h_remove() and h_bulk_remove() ]
Signed-off-by: Cédric Le Goater
---
 hw/ppc/spapr_hcall.c     | 14 +++++++++++---
 target-ppc/cpu.h         |  2 ++
 target-ppc/excp_helper.c |  9 +++++++++
 target-ppc/helper.h      |  1 +
 target-ppc/helper_regs.h | 13 +++++++++++++
 target-ppc/mmu-hash64.c  | 11 +++--------
 target-ppc/mmu_helper.c  |  9 ++++++++-
 target-ppc/translate.c   | 39 ++++++++++++++++++++++++++++++++++++---
 8 files changed, 83 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 8f40602a5efb..2713087c1e5d 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -183,6 +183,7 @@ static RemoveResult remove_hpte(PowerPCCPU *cpu, target_ulong ptex,
 static target_ulong h_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
                              target_ulong opcode, target_ulong *args)
 {
+    CPUPPCState *env = &cpu->env;
     target_ulong flags = args[0];
     target_ulong pte_index = args[1];
     target_ulong avpn = args[2];
@@ -193,6 +194,7 @@ static target_ulong h_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 
     switch (ret) {
     case REMOVE_SUCCESS:
+        check_tlb_flush(env);
         return H_SUCCESS;
 
     case REMOVE_NOT_FOUND:
@@ -229,7 +231,9 @@ static target_ulong h_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 static target_ulong h_bulk_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
                                   target_ulong opcode, target_ulong *args)
 {
+    CPUPPCState *env = &cpu->env;
     int i;
+    target_ulong rc = H_SUCCESS;
 
     for (i = 0; i < H_BULK_REMOVE_MAX_BATCH; i++) {
         target_ulong *tsh = &args[i*2];
@@ -262,14 +266,18 @@ static target_ulong h_bulk_remove(PowerPCCPU *cpu, sPAPRMachineState *spapr,
             break;
 
         case REMOVE_PARM:
-            return H_PARAMETER;
+            rc = H_PARAMETER;
+            goto exit;
 
         case REMOVE_HW:
-            return H_HARDWARE;
+            rc = H_HARDWARE;
+            goto exit;
         }
     }
+ exit:
+    check_tlb_flush(env);
 
-    return H_SUCCESS;
+    return rc;
 }
 
 static target_ulong h_protect(PowerPCCPU *cpu, sPAPRMachineState *spapr,
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 9588b30ee855..2a96efcbf813 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1069,6 +1069,8 @@ struct CPUPPCState {
     /* PowerPC 64 SLB area */
     ppc_slb_t slb[MAX_SLB_ENTRIES];
     int32_t slb_nr;
+    /* tcg TLB needs flush (deferred slb inval instruction typically) */
+    uint32_t tlb_need_flush;
 #endif
     /* segment registers */
     hwaddr htab_base;
diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c
index cf882ebdad4c..85f38640bdf4 100644
--- a/target-ppc/excp_helper.c
+++ b/target-ppc/excp_helper.c
@@ -717,6 +717,11 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
     /* Reset exception state */
     cs->exception_index = POWERPC_EXCP_NONE;
     env->error_code = 0;
+
+    /* Any interrupt is context synchronizing, check if TCG TLB
+     * needs a delayed flush on ppc64
+     */
+    check_tlb_flush(env);
 }
 
 void ppc_cpu_do_interrupt(CPUState *cs)
@@ -738,6 +743,7 @@ static void ppc_hw_interrupt(CPUPPCState *env)
                   __func__, env, env->pending_interrupts,
                   cs->interrupt_request, (int)msr_me, (int)msr_ee);
 #endif
+
     /* External reset */
     if (env->pending_interrupts & (1 << PPC_INTERRUPT_RESET)) {
         env->pending_interrupts &= ~(1 << PPC_INTERRUPT_RESET);
@@ -942,6 +948,9 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr,
      * as rfi is always the last insn of a TB
      */
     cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
+
+    /* Context synchronizing: check if TCG TLB needs flush */
+    check_tlb_flush(env);
 }
 
 void helper_rfi(CPUPPCState *env)
diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index e5a8f7b9b539..0526322f4d27 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -16,6 +16,7 @@ DEF_HELPER_1(rfmci, void, env)
 DEF_HELPER_1(rfid, void, env)
 DEF_HELPER_1(hrfid, void, env)
 #endif
+DEF_HELPER_1(check_tlb_flush, void, env)
 #endif
 
 DEF_HELPER_3(lmw, void, env, tl, i32)
diff --git a/target-ppc/helper_regs.h b/target-ppc/helper_regs.h
index f7edd5bc5945..57da931e3c4d 100644
--- a/target-ppc/helper_regs.h
+++ b/target-ppc/helper_regs.h
@@ -151,4 +151,17 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
     return excp;
 }
 
+#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
+static inline void check_tlb_flush(CPUPPCState *env)
+{
+    CPUState *cs = CPU(ppc_env_get_cpu(env));
+    if (env->tlb_need_flush) {
+        env->tlb_need_flush = 0;
+        tlb_flush(cs, 1);
+    }
+}
+#else
+static inline void check_tlb_flush(CPUPPCState *env) { }
+#endif
+
 #endif /* !defined(__HELPER_REGS_H__) */
diff --git a/target-ppc/mmu-hash64.c b/target-ppc/mmu-hash64.c
index 72c4ab5d751c..44fc1bfc288c 100644
--- a/target-ppc/mmu-hash64.c
+++ b/target-ppc/mmu-hash64.c
@@ -98,10 +98,8 @@ void dump_slb(FILE *f, fprintf_function cpu_fprintf, PowerPCCPU *cpu)
 
 void helper_slbia(CPUPPCState *env)
 {
-    PowerPCCPU *cpu = ppc_env_get_cpu(env);
-    int n, do_invalidate;
+    int n;
 
-    do_invalidate = 0;
     /* XXX: Warning: slbia never invalidates the first segment */
     for (n = 1; n < env->slb_nr; n++) {
         ppc_slb_t *slb = &env->slb[n];
@@ -112,12 +110,9 @@ void helper_slbia(CPUPPCState *env)
              * and we still don't have a tlb_flush_mask(env, n, mask)
              * in QEMU, we just invalidate all TLBs
              */
-            do_invalidate = 1;
+            env->tlb_need_flush = true;
         }
     }
-    if (do_invalidate) {
-        tlb_flush(CPU(cpu), 1);
-    }
 }
 
 void helper_slbie(CPUPPCState *env, target_ulong addr)
@@ -137,7 +132,7 @@ void helper_slbie(CPUPPCState *env, target_ulong addr)
          * and we still don't have a tlb_flush_mask(env, n, mask)
          * in QEMU, we just invalidate all TLBs
          */
-        tlb_flush(CPU(cpu), 1);
+        env->tlb_need_flush = true;
     }
 }
 
diff --git a/target-ppc/mmu_helper.c b/target-ppc/mmu_helper.c
index ff217941b5a7..930e9d31cfde 100644
--- a/target-ppc/mmu_helper.c
+++ b/target-ppc/mmu_helper.c
@@ -26,6 +26,7 @@
 #include "mmu-hash32.h"
 #include "exec/cpu_ldst.h"
 #include "exec/log.h"
+#include "helper_regs.h"
 
 //#define DEBUG_MMU
 //#define DEBUG_BATS
@@ -1923,6 +1924,7 @@ void ppc_tlb_invalidate_all(CPUPPCState *env)
     case POWERPC_MMU_2_06a:
     case POWERPC_MMU_2_07:
     case POWERPC_MMU_2_07a:
+        env->tlb_need_flush = 0;
 #endif /* defined(TARGET_PPC64) */
         tlb_flush(CPU(cpu), 1);
         break;
@@ -1985,7 +1987,7 @@ void ppc_tlb_invalidate_one(CPUPPCState *env, target_ulong addr)
          * and we still don't have a tlb_flush_mask(env, n, mask) in QEMU,
          * we just invalidate all TLBs
          */
-        tlb_flush(CPU(cpu), 1);
+        env->tlb_need_flush = 1;
         break;
 #endif /* defined(TARGET_PPC64) */
     default:
@@ -2874,6 +2876,11 @@ void helper_booke206_tlbflush(CPUPPCState *env, target_ulong type)
 }
 
 
+void helper_check_tlb_flush(CPUPPCState *env)
+{
+    check_tlb_flush(env);
+}
+
 /*****************************************************************************/
 
 /* try to fill the TLB and return an exception if error. If retaddr is
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 1119a301154c..62fabe952c35 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -3312,9 +3312,32 @@ static void gen_eieio(DisasContext *ctx)
 {
 }
 
+#if !defined(CONFIG_USER_ONLY) && defined(TARGET_PPC64)
+static inline void gen_check_tlb_flush(DisasContext *ctx)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    TCGLabel *l = gen_new_label();
+
+    tcg_gen_ld_i32(t, cpu_env, offsetof(CPUPPCState, tlb_need_flush));
+    tcg_gen_brcondi_i32(TCG_COND_EQ, t, 0, l);
+    gen_helper_check_tlb_flush(cpu_env);
+    gen_set_label(l);
+    tcg_temp_free_i32(t);
+}
+#else
+static inline void gen_check_tlb_flush(DisasContext *ctx) { }
+#endif
+
 /* isync */
 static void gen_isync(DisasContext *ctx)
 {
+    /*
+     * We need to check for a pending TLB flush. This can only happen in
+     * kernel mode however so check MSR_PR
+     */
+    if (!ctx->pr) {
+        gen_check_tlb_flush(ctx);
+    }
     gen_stop_exception(ctx);
 }
 
@@ -3471,6 +3494,15 @@ STCX(stqcx_, 16);
 /* sync */
 static void gen_sync(DisasContext *ctx)
 {
+    uint32_t l = (ctx->opcode >> 21) & 3;
+
+    /*
+     * For l == 2, it's a ptesync, We need to check for a pending TLB flush.
+     * This can only happen in kernel mode however so check MSR_PR as well.
+     */
+    if (l == 2 && !ctx->pr) {
+        gen_check_tlb_flush(ctx);
+    }
 }
 
 /* wait */
@@ -4878,10 +4910,11 @@ static void gen_tlbsync(DisasContext *ctx)
         gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
         return;
     }
-    /* This has no effect: it should ensure that all previous
-     * tlbie have completed
+    /* tlbsync is a nop for server, ptesync handles delayed tlb flush,
+     * embedded however needs to deal with tlbsync. We don't try to be
+     * fancy and swallow the overhead of checking for both.
      */
-    gen_stop_exception(ctx);
+    gen_check_tlb_flush(ctx);
 #endif
 }
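
Illustration only, not part of the patch: the batching scheme above can be
modelled outside QEMU in a few lines of C. The sketch below is a toy under
stated assumptions. ModelCPU, MODEL_TLB_ENTRIES, model_tlb_flush(),
model_invalidate(), model_check_tlb_flush() and model_run() are all
hypothetical names made up for this example; only the tlb_need_flush flag
and the test-and-clear logic mirror check_tlb_flush() from the patch. It
counts how many full flushes a workload of invalidation bursts triggers
when each burst is followed by a single synchronizing event.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_TLB_ENTRIES 256   /* hypothetical size, for illustration only */

/* Hypothetical stand-in for the relevant parts of CPUPPCState. */
typedef struct ModelCPU {
    uint64_t tlb[MODEL_TLB_ENTRIES]; /* stand-in for the TCG software TLB */
    uint32_t tlb_need_flush;         /* deferred-flush flag, as in the patch */
    unsigned flush_count;            /* number of full flushes performed */
} ModelCPU;

/* Stand-in for tlb_flush(): wipe the whole TLB and count the call. */
static void model_tlb_flush(ModelCPU *cpu)
{
    memset(cpu->tlb, 0, sizeof(cpu->tlb));
    cpu->flush_count++;
}

/* Models slbie/tlbie after the patch: only record that a flush is due. */
static void model_invalidate(ModelCPU *cpu)
{
    cpu->tlb_need_flush = 1;
}

/* Models check_tlb_flush(): flush at most once per batch, then clear. */
static void model_check_tlb_flush(ModelCPU *cpu)
{
    if (cpu->tlb_need_flush) {
        cpu->tlb_need_flush = 0;
        model_tlb_flush(cpu);
    }
}

/*
 * Run `bursts` bursts of `per_burst` invalidations, each burst followed by
 * one synchronizing event (interrupt, rfi, isync, ...), and return the
 * number of full TLB flushes that were performed.
 */
static unsigned model_run(unsigned bursts, unsigned per_burst)
{
    ModelCPU cpu;
    unsigned b, i;

    memset(&cpu, 0, sizeof(cpu));
    for (b = 0; b < bursts; b++) {
        for (i = 0; i < per_burst; i++) {
            model_invalidate(&cpu);
        }
        model_check_tlb_flush(&cpu);
    }
    /* Nothing may remain pending after the last synchronizing event. */
    assert(cpu.tlb_need_flush == 0);
    return cpu.flush_count;
}
```

In this model, model_run(10, 4) performs 10 flushes, where flushing on every
invalidation would perform 40. A synchronizing event with nothing pending
costs only the flag test, which is the same fast path that
gen_check_tlb_flush() emits inline in the patch.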