From patchwork Tue Aug 9 23:12:46 2016
X-Patchwork-Submitter: Pranith Kumar
X-Patchwork-Id: 9272323
From: Pranith Kumar
To: Richard Henderson, qemu-devel@nongnu.org (open list:All patches CC here)
Cc: alex.bennee@linaro.org, qemu-devel@nongnu.org
Date: Tue, 9 Aug 2016 19:12:46 -0400
Message-Id: <20160809231246.4537-1-bobby.prani@gmail.com>
X-Mailer: git-send-email 2.9.2
Subject: [Qemu-devel] [RFC v2 PATCH] tcg: Optimize fence instructions

This commit optimizes fence instructions. Two optimizations are
currently implemented. These are:

1. Unnecessary duplicate fence instructions

   If the same fence instruction is detected consecutively, we remove
   one instance of it.

   ex: mb; mb => mb, strl; strl => strl

2. Merging weaker fence with subsequent/previous stronger fence

   load-acquire/store-release fence can be combined with a full fence
   without relaxing the ordering constraint.

   ex: a) ld; ldaq; mb => ld; mb
       b) mb; strl; st => mb; st

Signed-off-by: Pranith Kumar
---
v2:
 - Properly remove current op
 - Reset only when you encounter memory operations or end of block
 - Review comments from v1

 tcg/optimize.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c      |  4 +++
 tcg/tcg.h      |  1 +
 3 files changed, 89 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index cffe89b..5963a39 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -538,6 +538,90 @@ static bool swap_commutative2(TCGArg *p1, TCGArg *p2)
     return false;
 }
 
+/* Eliminate duplicate and unnecessary fence instructions */
+void tcg_optimize_mb(TCGContext *s)
+{
+    int oi, oi_next;
+    TCGArg prev_op_mb = -1;
+    TCGOp *prev_op = NULL;
+
+    for (oi = s->gen_op_buf[0].next; oi != 0; oi = oi_next) {
+        TCGOp *op = &s->gen_op_buf[oi];
+        TCGArg *args = &s->gen_opparam_buf[op->args];
+        TCGOpcode opc = op->opc;
+
+        switch (opc) {
+        case INDEX_op_mb:
+        {
+            TCGBar curr_mb_type = args[0] & 0xF0;
+            TCGBar prev_mb_type = prev_op_mb & 0xF0;
+
+            if (curr_mb_type == prev_mb_type ||
+                (curr_mb_type == TCG_BAR_STRL && prev_mb_type == TCG_BAR_SC)) {
+                /* Remove the current weaker barrier op. The previous
+                 * barrier is stronger and sufficient.
+                 * mb; strl => mb; st
+                 */
+                tcg_op_remove(s, op);
+                op = prev_op;
+                break;
+            } else if (curr_mb_type == TCG_BAR_SC &&
+                       prev_mb_type == TCG_BAR_LDAQ) {
+                /* Remove the previous weaker barrier op. The current
+                 * barrier is stronger and sufficient.
+                 * ldaq; mb => ld; mb
+                 */
+                tcg_op_remove(s, prev_op);
+            } else if (curr_mb_type == TCG_BAR_STRL &&
+                       prev_mb_type == TCG_BAR_LDAQ) {
+                /* Consecutive load-acquire and store-release barriers
+                 * can be merged into one stronger SC barrier
+                 * ldaq; strl => ld; mb; st
+                 */
+                args[0] = (args[0] & 0x0F) | TCG_BAR_SC;
+                tcg_op_remove(s, prev_op);
+            }
+            prev_op_mb = args[0];
+            prev_op = op;
+            break;
+        }
+        case INDEX_op_insn_start:
+            break;
+        case INDEX_op_ld8u_i32:
+        case INDEX_op_ld8u_i64:
+        case INDEX_op_ld8s_i32:
+        case INDEX_op_ld8s_i64:
+        case INDEX_op_ld16u_i32:
+        case INDEX_op_ld16u_i64:
+        case INDEX_op_ld16s_i32:
+        case INDEX_op_ld16s_i64:
+        case INDEX_op_ld_i32:
+        case INDEX_op_ld32u_i64:
+        case INDEX_op_ld32s_i64:
+        case INDEX_op_ld_i64:
+        case INDEX_op_st8_i32:
+        case INDEX_op_st8_i64:
+        case INDEX_op_st16_i32:
+        case INDEX_op_st16_i64:
+        case INDEX_op_st_i32:
+        case INDEX_op_st32_i64:
+        case INDEX_op_st_i64:
+        case INDEX_op_call:
+            prev_op_mb = -1;
+            prev_op = NULL;
+            break;
+        default:
+            if (tcg_op_defs[opc].flags & TCG_OPF_BB_END) {
+                prev_op_mb = -1;
+                prev_op = NULL;
+            }
+            break;
+        }
+
+        oi_next = op->next;
+    }
+}
+
 /* Propagate constants and copies, fold constant expressions. */
 void tcg_optimize(TCGContext *s)
 {
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 42417bd..1db319e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2587,6 +2587,10 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb)
         }
     }
 
+#ifdef USE_TCG_OPTIMIZATIONS
+    tcg_optimize_mb(s);
+#endif
+
 #ifdef CONFIG_PROFILER
     s->la_time += profile_getclock();
 #endif
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 9ed78dc..79bb5bb 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -921,6 +921,7 @@ void tcg_op_remove(TCGContext *s, TCGOp *op);
 TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *op, TCGOpcode opc, int narg);
 TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *op, TCGOpcode opc, int narg);
 
+void tcg_optimize_mb(TCGContext *s);
 void tcg_optimize(TCGContext *s);
 
 /* only used for debugging purposes */
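
For reviewers who want to try the merge rules outside of QEMU, here is a minimal standalone sketch of the same state machine over a plain array. It is only an illustration, not part of the patch: `enum op_kind`, `drop_at()` and `merge_barriers()` are hypothetical names, `OP_MEM` stands in for every op that resets the tracking state in the patch (loads, stores, calls, end of block), and the TCG op list is replaced by in-place array compaction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified op kinds; these are NOT the real TCG opcodes. */
enum op_kind { OP_OTHER, OP_MEM, OP_BAR_LDAQ, OP_BAR_STRL, OP_BAR_SC };

static int is_barrier(enum op_kind k)
{
    return k == OP_BAR_LDAQ || k == OP_BAR_STRL || k == OP_BAR_SC;
}

/* Remove element idx from the first len entries; return the new length. */
static size_t drop_at(enum op_kind *ops, size_t len, size_t idx)
{
    assert(idx < len);
    memmove(&ops[idx], &ops[idx + 1], (len - idx - 1) * sizeof(ops[0]));
    return len - 1;
}

/*
 * Compact ops[0..n) in place using the rules from the patch and return
 * the new length. prev is the output index of the last barrier that has
 * not yet been separated from the current op by a memory op (-1 if none).
 */
size_t merge_barriers(enum op_kind *ops, size_t n)
{
    size_t out = 0;
    ptrdiff_t prev = -1;

    for (size_t i = 0; i < n; i++) {
        enum op_kind cur = ops[i];

        if (!is_barrier(cur)) {
            if (cur == OP_MEM) {
                prev = -1;      /* memory op: barriers can no longer merge */
            }
            ops[out++] = cur;
            continue;
        }
        if (prev >= 0) {
            enum op_kind p = ops[prev];

            if (cur == p || (cur == OP_BAR_STRL && p == OP_BAR_SC)) {
                continue;       /* duplicate, or weaker than previous: drop */
            }
            if (p == OP_BAR_LDAQ &&
                (cur == OP_BAR_SC || cur == OP_BAR_STRL)) {
                /* ldaq; mb => mb   and   ldaq; strl => mb */
                cur = OP_BAR_SC;
                out = drop_at(ops, out, (size_t)prev);
            }
        }
        prev = (ptrdiff_t)out;
        ops[out++] = cur;
    }
    return out;
}
```

As in the patch, combinations that are not listed (for example strl followed by mb) are left untouched, and ops that are neither barriers nor memory ops do not reset the tracked barrier.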