From patchwork Fri Apr 22 00:06:23 2016
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 8905011
From: "Emilio G. Cota"
To: QEMU Developers, MTTCG Devel
Cc: Paolo Bonzini, Sergey Fedorov, Richard Henderson, Alex Bennée, Peter Crosthwaite
Date: Thu, 21 Apr 2016 20:06:23 -0400
Message-Id: <1461283583-2833-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [RFC] translate-all: protect code_gen_buffer with RCU

This is a first attempt at making tb_flush not have to stop all CPUs.
There are issues as pointed out below, but this could be a good start.

Context:
  https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg04658.html
  https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg06942.html

Known issues:

- Basically compile-tested only, since I've only run this with
  single-threaded TCG; I also tried running it with linux-user, but in
  order to trigger tb_flush I had to make code_gen_buffer so small that
  the CPU calling tb_flush would immediately fill the 2nd buffer,
  triggering the assert. If you have a working multi-threaded workload
  with which to test this, please let me know.

- Windows: not even compile-tested!

Signed-off-by: Emilio G. Cota
---
 translate-all.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 117 insertions(+), 5 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index bba9b62..4c14b4d 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -536,8 +536,13 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
 #endif
 
 #ifdef USE_STATIC_CODE_GEN_BUFFER
-static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
+static uint8_t static_code_gen_buffer1[DEFAULT_CODE_GEN_BUFFER_SIZE]
     __attribute__((aligned(CODE_GEN_ALIGN)));
+static uint8_t static_code_gen_buffer2[DEFAULT_CODE_GEN_BUFFER_SIZE]
+    __attribute__((aligned(CODE_GEN_ALIGN)));
+static int static_buf_mask = 1;
+static void *static_buf1;
+static void *static_buf2;
 
 # ifdef _WIN32
 static inline void do_protect(void *addr, long size, int prot)
@@ -580,13 +585,12 @@ static inline void map_none(void *addr, long size)
 }
 # endif /* WIN32 */
 
-static inline void *alloc_code_gen_buffer(void)
+static void *alloc_static_code_gen_buffer(void *buf)
 {
-    void *buf = static_code_gen_buffer;
     size_t full_size, size;
 
     /* The size of the buffer, rounded down to end on a page boundary. */
-    full_size = (((uintptr_t)buf + sizeof(static_code_gen_buffer))
+    full_size = (((uintptr_t)buf + sizeof(static_code_gen_buffer1))
                  & qemu_real_host_page_mask) - (uintptr_t)buf;
 
     /* Reserve a guard page. */
@@ -612,6 +616,15 @@ static inline void *alloc_code_gen_buffer(void)
 
     return buf;
 }
+
+static inline void *alloc_code_gen_buffer(void)
+{
+    static_buf1 = alloc_static_code_gen_buffer(static_code_gen_buffer1);
+    static_buf2 = alloc_static_code_gen_buffer(static_code_gen_buffer2);
+
+    assert(static_buf_mask == 1);
+    return static_buf1;
+}
 #elif defined(_WIN32)
 static inline void *alloc_code_gen_buffer(void)
 {
@@ -829,8 +842,100 @@ static void page_flush_tb(void)
     }
 }
 
+#ifdef USE_STATIC_CODE_GEN_BUFFER
+
+struct code_gen_desc {
+    struct rcu_head rcu;
+    int clear_bit;
+};
+
+static void code_gen_buffer_clear(struct rcu_head *rcu)
+{
+    struct code_gen_desc *desc = container_of(rcu, struct code_gen_desc, rcu);
+
+    tb_lock();
+    static_buf_mask &= ~desc->clear_bit;
+    tb_unlock();
+    g_free(desc);
+}
+
+static void *code_gen_buffer_replace(void)
+{
+    struct code_gen_desc *desc = g_malloc0(sizeof(*desc));
+
+    /*
+     * If both bits are set, we're having two concurrent flushes. This
+     * can easily happen if the buffers are heavily undersized.
+     */
+    assert(static_buf_mask == 1 || static_buf_mask == 2);
+
+    desc->clear_bit = static_buf_mask;
+    call_rcu1(&desc->rcu, code_gen_buffer_clear);
+
+    if (static_buf_mask == 1) {
+        static_buf_mask |= 2;
+        return static_buf2;
+    }
+    static_buf_mask |= 1;
+    return static_buf1;
+}
+
+#elif defined(_WIN32)
+
+struct code_gen_desc {
+    struct rcu_head rcu;
+    void *buf;
+};
+
+static void code_gen_buffer_vfree(struct rcu_head *rcu)
+{
+    struct code_gen_desc *desc = container_of(rcu, struct code_gen_desc, rcu);
+
+    VirtualFree(desc->buf, 0, MEM_RELEASE);
+    g_free(desc);
+}
+
+static void *code_gen_buffer_replace(void)
+{
+    struct code_gen_desc *desc;
+
+    desc = g_malloc0(sizeof(*desc));
+    desc->buf = tcg_ctx.code_gen_buffer;
+    call_rcu1(&desc->rcu, code_gen_buffer_vfree);
+
+    return alloc_code_gen_buffer();
+}
+
+#else /* UNIX, dynamically-allocated code buffer */
+
+struct code_gen_desc {
+    struct rcu_head rcu;
+    void *buf;
+    size_t size;
+};
+
+static void code_gen_buffer_unmap(struct rcu_head *rcu)
+{
+    struct code_gen_desc *desc = container_of(rcu, struct code_gen_desc, rcu);
+
+    munmap(desc->buf, desc->size + qemu_real_host_page_size);
+    g_free(desc);
+}
+
+static void *code_gen_buffer_replace(void)
+{
+    struct code_gen_desc *desc;
+
+    desc = g_malloc0(sizeof(*desc));
+    desc->buf = tcg_ctx.code_gen_buffer;
+    desc->size = tcg_ctx.code_gen_buffer_size;
+    call_rcu1(&desc->rcu, code_gen_buffer_unmap);
+
+    return alloc_code_gen_buffer();
+}
+#endif /* USE_STATIC_CODE_GEN_BUFFER */
+
 /* flush all the translation blocks */
-/* XXX: tb_flush is currently not thread safe */
 void tb_flush(CPUState *cpu)
 {
 #if defined(DEBUG_FLUSH)
@@ -853,10 +958,17 @@ void tb_flush(CPUState *cpu)
     qht_reset_size(&tcg_ctx.tb_ctx.htable, CODE_GEN_HTABLE_SIZE);
     page_flush_tb();
 
+    tcg_ctx.code_gen_buffer = code_gen_buffer_replace();
     tcg_ctx.code_gen_ptr = tcg_ctx.code_gen_buffer;
+    tcg_prologue_init(&tcg_ctx);
 
     /* XXX: flush processor icache at this point if cache flush is
        expensive */
     tcg_ctx.tb_ctx.tb_flush_count++;
+
+    /* exit all CPUs so that the old buffer is quickly cleared. */
+    CPU_FOREACH(cpu) {
+        cpu_exit(cpu);
+    }
 }
 
 #ifdef DEBUG_TB_CHECK