From patchwork Wed Jul 6 18:26:52 2016
X-Patchwork-Submitter: Samuel Damashek
X-Patchwork-Id: 9216855
From: Samuel Damashek
To: qemu-devel@nongnu.org
Date: Wed, 6 Jul 2016 14:26:52 -0400
Message-Id: <20160706182652.16190-1-samuel.damashek@invincea.com>
Subject: [Qemu-devel] [PATCH] Fix for self-modifying writes across page boundaries.
Cc: Paolo Bonzini, Richard Henderson, Samuel Damashek, Peter Crosthwaite

As it currently stands, QEMU does not properly handle self-modifying code
when the write is unaligned and crosses a page boundary. The procedure for
handling a write to the current translation block is to write-protect the
current translation block, catch the write, and split up the translation
block into the current instruction (which remains write-protected so that
the current instruction is not modified) and the remaining instructions in
the translation block. The CPU state is then restored to before the write
occurred, so that the write is retried and successfully executed. However,
since unaligned writes across pages are split into one-byte writes for
simplicity, writes to the second page (which is not part of the current TB)
may succeed before a write to the current TB is attempted. Because these
writes are not invalidated before the CPU state is restored after splitting
the TB, they are performed a second time, corrupting the second page.
Credit goes to Patrick Hulin for discovering this.

In recent 64-bit versions of Windows running in emulated mode, this results
in either severe instability (a BSOD after a couple of minutes of uptime)
or a complete failure to boot. Windows performs one or more 8-byte
unaligned self-modifying writes (xors) which intersect the end of the
current TB and the beginning of the next TB, triggering the issue described
above.

This commit fixes the issue by making the unaligned write loop perform the
writes in forward order instead of reverse order. This way, QEMU
immediately tries to write to the current TB and splits the TB before any
write to the second page is executed. The write then proceeds as intended.
With this patch applied, I am able to boot and use Windows 7 64-bit and
Windows 10 64-bit in QEMU without KVM.

Per Richard Henderson's input, this patch also fills the TLB entry for the
second page before executing the write loop, ensuring that the second page
is mapped.

The VICTIM_TLB_HIT macro was also updated to accept a second argument, the
name of the address variable to use, so that an additional variable
(addr_page2) could be introduced instead of temporarily setting addr to the
second page's address.

The original discussion of the issue is located at
http://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg02161.html.

Signed-off-by: Samuel Damashek
---
 softmmu_template.h | 61 ++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 15 deletions(-)
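To make the ordering issue concrete, here is a minimal standalone sketch
(illustration only, not part of the patch; the 4 KiB page size and the
example address are invented for the demo). It prints which page each byte
of an 8-byte page-crossing store lands on, in the order the old
reverse-direction loop and the new forward-direction loop issue the
one-byte stores:

    /*
     * Illustration only -- not part of the patch.  Page size and the
     * example address below are made up; this just shows the order in
     * which the byte stores of a page-crossing write touch each page.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_BITS 12
    #define PAGE_SIZE (1u << PAGE_BITS)
    #define PAGE_MASK (~(uint64_t)(PAGE_SIZE - 1))

    static void show(const char *label, uint64_t addr, int size, int forward)
    {
        printf("%-14s:", label);
        for (int n = 0; n < size; n++) {
            int i = forward ? n : size - 1 - n;   /* byte index within the store */
            int page = ((addr + i) & PAGE_MASK) == (addr & PAGE_MASK) ? 1 : 2;
            printf(" byte%d->page%d", i, page);
        }
        printf("\n");
    }

    int main(void)
    {
        /* 8-byte store whose last three bytes spill onto the next page. */
        uint64_t addr = 0x5000 - 5;
        show("reverse (old)", addr, 8, 0);   /* page-2 bytes are stored first */
        show("forward (new)", addr, 8, 1);   /* page-1 byte faults first */
        return 0;
    }

With the reverse loop, the bytes on the second page are stored before the
byte that faults on the write-protected first page; with the forward loop,
the faulting byte is attempted first, so nothing on the second page has
been written when the CPU state is rolled back and the store is retried.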
diff --git a/softmmu_template.h b/softmmu_template.h
index 208f808..0d173a0 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -117,7 +117,7 @@
 #endif
 
 /* macro to check the victim tlb */
-#define VICTIM_TLB_HIT(ty)                                                    \
+#define VICTIM_TLB_HIT(ty, addr_nm)                                           \
 ({                                                                            \
     /* we are about to do a page table walk. our last hope is the            \
      * victim tlb. try to refill from the victim tlb before walking the      \
@@ -126,7 +126,8 @@
     CPUIOTLBEntry tmpiotlb;                                                   \
     CPUTLBEntry tmptlb;                                                       \
     for (vidx = CPU_VTLB_SIZE-1; vidx >= 0; --vidx) {                         \
-        if (env->tlb_v_table[mmu_idx][vidx].ty == (addr & TARGET_PAGE_MASK)) {\
+        if (env->tlb_v_table[mmu_idx][vidx].ty                                \
+            == (addr_nm & TARGET_PAGE_MASK)) {                                \
             /* found entry in victim tlb, swap tlb and iotlb */               \
             tmptlb = env->tlb_table[mmu_idx][index];                          \
             env->tlb_table[mmu_idx][index] = env->tlb_v_table[mmu_idx][vidx]; \
@@ -185,7 +186,7 @@ WORD_TYPE helper_le_ld_name(CPUArchState *env, target_ulong addr,
             cpu_unaligned_access(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
                                  mmu_idx, retaddr);
         }
-        if (!VICTIM_TLB_HIT(ADDR_READ)) {
+        if (!VICTIM_TLB_HIT(ADDR_READ, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
                      mmu_idx, retaddr);
         }
@@ -269,7 +270,7 @@ WORD_TYPE helper_be_ld_name(CPUArchState *env, target_ulong addr,
             cpu_unaligned_access(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
                                  mmu_idx, retaddr);
         }
-        if (!VICTIM_TLB_HIT(ADDR_READ)) {
+        if (!VICTIM_TLB_HIT(ADDR_READ, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, READ_ACCESS_TYPE,
                      mmu_idx, retaddr);
         }
@@ -389,7 +390,7 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
             cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
                                  mmu_idx, retaddr);
         }
-        if (!VICTIM_TLB_HIT(addr_write)) {
+        if (!VICTIM_TLB_HIT(addr_write, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
         }
         tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
@@ -414,16 +415,31 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
     if (DATA_SIZE > 1
         && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
                     >= TARGET_PAGE_SIZE)) {
-        int i;
+        int i, index_page2;
+        target_ulong addr_page2, tlb_addr_page2;
     do_unaligned_access:
         if ((get_memop(oi) & MO_AMASK) == MO_ALIGN) {
             cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
                                  mmu_idx, retaddr);
         }
+
+        /* Ensure the second page is in the TLB.  Note that the first page
+         * is already guaranteed to be filled into the TLB. */
+        addr_page2 = (addr + DATA_SIZE) & TARGET_PAGE_MASK;
+        index_page2 = (addr_page2 >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+        tlb_addr_page2 = env->tlb_table[mmu_idx][index_page2].addr_write;
+
+        if (((addr_page2 & TARGET_PAGE_MASK)
+             != (tlb_addr_page2 & (TARGET_PAGE_MASK | TLB_INVALID_MASK)))
+            && !VICTIM_TLB_HIT(addr_write, addr_page2)) {
+            tlb_fill(ENV_GET_CPU(env), addr_page2, MMU_DATA_STORE,
+                     mmu_idx, retaddr);
+        }
+
         /* XXX: not efficient, but simple */
-        /* Note: relies on the fact that tlb_fill() does not remove the
-         * previous page from the TLB cache.  */
-        for (i = DATA_SIZE - 1; i >= 0; i--) {
+        /* This loop must go in the forward direction to avoid issues with
+         * self-modifying code in Windows 64-bit. */
+        for (i = 0; i < DATA_SIZE; i++) {
             /* Little-endian extract.  */
             uint8_t val8 = val >> (i * 8);
             /* Note the adjustment at the beginning of the function.
@@ -469,7 +485,7 @@ void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
             cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
                                  mmu_idx, retaddr);
         }
-        if (!VICTIM_TLB_HIT(addr_write)) {
+        if (!VICTIM_TLB_HIT(addr_write, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
         }
         tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
@@ -494,16 +510,31 @@ void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
     if (DATA_SIZE > 1
         && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
                     >= TARGET_PAGE_SIZE)) {
-        int i;
+        int i, index_page2;
+        target_ulong addr_page2, tlb_addr_page2;
     do_unaligned_access:
         if ((get_memop(oi) & MO_AMASK) == MO_ALIGN) {
             cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
                                  mmu_idx, retaddr);
         }
+
+        /* Ensure the second page is in the TLB.  Note that the first page
+         * is already guaranteed to be filled into the TLB. */
+        addr_page2 = (addr + DATA_SIZE) & TARGET_PAGE_MASK;
+        index_page2 = (addr_page2 >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+        tlb_addr_page2 = env->tlb_table[mmu_idx][index_page2].addr_write;
+
+        if (((addr_page2 & TARGET_PAGE_MASK)
+             != (tlb_addr_page2 & (TARGET_PAGE_MASK | TLB_INVALID_MASK)))
+            && !VICTIM_TLB_HIT(addr_write, addr_page2)) {
+            tlb_fill(ENV_GET_CPU(env), addr_page2, MMU_DATA_STORE,
+                     mmu_idx, retaddr);
+        }
+
         /* XXX: not efficient, but simple */
-        /* Note: relies on the fact that tlb_fill() does not remove the
-         * previous page from the TLB cache.  */
-        for (i = DATA_SIZE - 1; i >= 0; i--) {
+        /* This loop must go in the forward direction to avoid issues with
+         * self-modifying code in Windows 64-bit. */
+        for (i = 0; i < DATA_SIZE; i++) {
             /* Big-endian extract.  */
             uint8_t val8 = val >> (((DATA_SIZE - 1) * 8) - (i * 8));
             /* Note the adjustment at the beginning of the function.
@@ -542,7 +573,7 @@ void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
     if ((addr & TARGET_PAGE_MASK)
         != (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
         /* TLB entry is for a different page */
-        if (!VICTIM_TLB_HIT(addr_write)) {
+        if (!VICTIM_TLB_HIT(addr_write, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
         }
     }