From patchwork Wed Jun 12 17:08:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marius Hillenbrand
X-Patchwork-Id: 10990411
From: Marius Hillenbrand
To: kvm@vger.kernel.org
Cc: Marius Hillenbrand, linux-kernel@vger.kernel.org,
 kernel-hardening@lists.openwall.com, linux-mm@kvack.org,
 Alexander Graf, David Woodhouse
Subject: [RFC 01/10] x86/mm/kaslr: refactor to use enum indices for regions
Date: Wed, 12 Jun 2019 19:08:26 +0200
Message-Id: <20190612170834.14855-2-mhillenb@amazon.de>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de>
References: <20190612170834.14855-1-mhillenb@amazon.de>

The KASLR randomization code currently refers to specific regions, such
as the vmalloc area, by literal indices into an array. When adding new
regions, we have to be careful to also change all indices that may
potentially change. Avoid that risk by introducing an enum used as
indices.
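
For illustration only, a minimal user-space sketch of the pattern
described above: an enum supplies the array indices and designated
initializers bind each entry to its index, so adding a region later
cannot silently shift the others. The struct region, the REGION_COUNT
sentinel and set_region_size() are hypothetical names for this sketch
and are not part of the patch.

```c
#include <assert.h>

/* Region indices; REGION_COUNT is a hypothetical sentinel for this sketch. */
enum region_idx {
	PHYSMAP,
	VMALLOC,
	VMMEMMAP,
	REGION_COUNT,
};

/* Simplified stand-in for struct kaslr_memory_region. */
struct region {
	const char *name;
	unsigned long size_tb;
};

/* Designated initializers tie each entry to its enum index. */
static struct region regions[] = {
	[PHYSMAP]  = { "physmap", 0 },
	[VMALLOC]  = { "vmalloc", 0 },
	[VMMEMMAP] = { "vmemmap", 1 },
};

/* Compile-time check that the table has exactly one slot per index. */
_Static_assert(sizeof(regions) / sizeof(regions[0]) == REGION_COUNT,
	       "regions[] must have one entry per enum index");

/* Update a region by name instead of by literal position. */
static void set_region_size(enum region_idx idx, unsigned long size_tb)
{
	assert((unsigned int)idx < REGION_COUNT);
	regions[idx].size_tb = size_tb;
}
```

A caller would write set_region_size(VMALLOC, 32) rather than touching
regions[1] directly; inserting a new enumerator before VMALLOC then
needs no change at the call sites.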

Signed-off-by: Marius Hillenbrand
Cc: Alexander Graf
Cc: David Woodhouse
---
 arch/x86/mm/kaslr.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 3f452ffed7e9..c455f1ffba29 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -41,6 +41,12 @@
  */
 static const unsigned long vaddr_end = CPU_ENTRY_AREA_BASE;
 
+enum {
+	PHYSMAP,
+	VMALLOC,
+	VMMEMMAP,
+};
+
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
  * earlier during boot). The list is ordered based on virtual addresses. This
@@ -50,9 +56,9 @@ static __initdata struct kaslr_memory_region {
 	unsigned long *base;
 	unsigned long size_tb;
 } kaslr_regions[] = {
-	{ &page_offset_base, 0 },
-	{ &vmalloc_base, 0 },
-	{ &vmemmap_base, 1 },
+	[PHYSMAP] = { &page_offset_base, 0 },
+	[VMALLOC] = { &vmalloc_base, 0 },
+	[VMMEMMAP] = { &vmemmap_base, 1 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -94,20 +100,20 @@ void __init kernel_randomize_memory(void)
 	if (!kaslr_memory_enabled())
 		return;
 
-	kaslr_regions[0].size_tb = 1 << (__PHYSICAL_MASK_SHIFT - TB_SHIFT);
-	kaslr_regions[1].size_tb = VMALLOC_SIZE_TB;
+	kaslr_regions[PHYSMAP].size_tb = 1 << (__PHYSICAL_MASK_SHIFT - TB_SHIFT);
+	kaslr_regions[VMALLOC].size_tb = VMALLOC_SIZE_TB;
 
 	/*
 	 * Update Physical memory mapping to available and
 	 * add padding if needed (especially for memory hotplug support).
 	 */
-	BUG_ON(kaslr_regions[0].base != &page_offset_base);
+	BUG_ON(kaslr_regions[PHYSMAP].base != &page_offset_base);
 	memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
 	/* Adapt phyiscal memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
-		kaslr_regions[0].size_tb = memory_tb;
+	if (memory_tb < kaslr_regions[PHYSMAP].size_tb)
+		kaslr_regions[PHYSMAP].size_tb = memory_tb;
 
 	/* Calculate entropy available between regions */
 	remain_entropy = vaddr_end - vaddr_start;

From patchwork Wed Jun 12 17:08:28 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marius Hillenbrand
X-Patchwork-Id: 10990417
From: Marius Hillenbrand
To: kvm@vger.kernel.org
Cc: Marius Hillenbrand, linux-kernel@vger.kernel.org,
 kernel-hardening@lists.openwall.com, linux-mm@kvack.org,
 Alexander Graf, David Woodhouse, Julian Stecklina
Subject: [RFC 02/10] x86/speculation, mm: add process local virtual memory region
Date: Wed, 12 Jun 2019 19:08:28 +0200
Message-Id: <20190612170834.14855-3-mhillenb@amazon.de>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de>
References: <20190612170834.14855-1-mhillenb@amazon.de>

The Linux kernel has a global address space that is the same for any
kernel code. This address space becomes a liability in a world with
processor information leak vulnerabilities, such as L1TF. With the right
cache load gadget, an attacker-controlled hyperthread pair can leak
arbitrary data via L1TF. Disabling hyperthreading is one recommended
mitigation, but it comes with a large performance hit for a wide range
of workloads.

An alternative mitigation is to not make certain kernel data globally
visible, and to map it only while the kernel executes in the context of
the process to which the data belongs.

This patch introduces a region for process-local memory into the
kernel's virtual address space. The region is 64 GiB long (more than
enough space while leaving enough room for KASLR) and always occupies a
pgd entry that is exclusive to process-local mappings (other pgd entries
may point to shared page tables for the kernel space).
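
As a rough illustration of the exclusive-pgd placement described above,
the following user-space sketch assumes 4-level paging (one pgd entry
maps 512 GiB) and a reserved window of exactly 2 * PGDIR_SIZE bytes.
place_in_exclusive_pgd() and pgd_slot_of() are hypothetical helpers for
this sketch, not functions from the patch; the arithmetic is modeled on
adjust_proclocal_base() in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4-level paging values: one pgd entry maps 512 GiB. */
#define PGDIR_SIZE	(1ULL << 39)
#define PROCLOCAL_SIZE	(64ULL << 30)	/* 64 GiB, as in the patch */

/*
 * Hypothetical helper: window_start is the (possibly unaligned) start of
 * a reserved window of 2 * PGDIR_SIZE bytes. Returns a base such that
 * [base, base + PROCLOCAL_SIZE) lies within one pgd entry that the
 * window covers completely.
 */
static uint64_t place_in_exclusive_pgd(uint64_t window_start)
{
	/* One pgd entry ahead always lands in the fully covered entry. */
	uint64_t base = window_start + PGDIR_SIZE;

	/* Too close to the end of that entry: pull the region back. */
	if ((base % PGDIR_SIZE) > (PGDIR_SIZE - PROCLOCAL_SIZE))
		base -= PROCLOCAL_SIZE;

	/* Mirrors the BUG_ON() in the patch: the region must not straddle. */
	assert((base % PGDIR_SIZE) + PROCLOCAL_SIZE <= PGDIR_SIZE);
	return base;
}

/* Hypothetical helper: absolute number of the pgd-sized slot holding addr. */
static uint64_t pgd_slot_of(uint64_t addr)
{
	return addr / PGDIR_SIZE;
}
```

With a pgd-aligned window starting at 0xffffeb0000000000, the helper
returns 0xffffeb8000000000, which equals __PROCLOCAL_BASE_L4 in this
patch. With a window that starts 500 GiB into a pgd entry, the base is
pulled back by PROCLOCAL_SIZE so that the region still ends inside the
fully covered entry.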

Signed-off-by: Marius Hillenbrand
Inspired-by: Julian Stecklina (while jsteckli@amazon.de)
Cc: Alexander Graf
Cc: David Woodhouse
---
 Documentation/x86/x86_64/mm.txt         | 11 +++++--
 arch/x86/Kconfig                        |  1 +
 arch/x86/include/asm/page_64.h          |  4 +++
 arch/x86/include/asm/pgtable_64_types.h | 12 ++++++++
 arch/x86/kernel/head64.c                |  8 +++++
 arch/x86/mm/dump_pagetables.c           |  9 ++++++
 arch/x86/mm/fault.c                     | 19 ++++++++++++
 arch/x86/mm/kaslr.c                     | 41 +++++++++++++++++++++++++
 security/Kconfig                        | 18 +++++++++++
 9 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 804f9426ed17..476519759cdc 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -40,7 +40,10 @@ ____________________________________________________________|___________________
    ffffc90000000000 |  -55    TB | ffffe8ffffffffff |   32 TB | vmalloc/ioremap space (vmalloc_base)
    ffffe90000000000 |  -23    TB | ffffe9ffffffffff |    1 TB | ... unused hole
    ffffea0000000000 |  -22    TB | ffffeaffffffffff |    1 TB | virtual memory map (vmemmap_base)
-   ffffeb0000000000 |  -21    TB | ffffebffffffffff |    1 TB | ... unused hole
+   ffffeb0000000000 |  -21    TB | ffffeb7fffffffff |  512 GB | ... unused hole
+   ffffeb8000000000 |  -20.5  TB | ffffebffffffffff |  512 GB | process-local kernel memory (layout shared but mappings
+                    |            |                  |         | exclusive to processes, needs an exclusive entry in the
+                    |            |                  |         | top-level page table)
    ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory
   __________________|____________|__________________|_________|____________________________________________________________
                                                               |
@@ -98,7 +101,11 @@ ____________________________________________________________|___________________
    ffa0000000000000 |  -24    PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
    ffd2000000000000 |  -11.5  PB | ffd3ffffffffffff |  0.5 PB | ... unused hole
    ffd4000000000000 |  -11    PB | ffd5ffffffffffff |  0.5 PB | virtual memory map (vmemmap_base)
-   ffd6000000000000 |  -10.5  PB | ffdeffffffffffff | 2.25 PB | ... unused hole
+   ffd6000000000000 |  -10.5  PB | ffd7ffffffffffff |  0.5 PB | ... unused hole
+   ffd8000000000000 |  -10    PB | ffd8ffffffffffff |  256 TB | process-local kernel memory (layout shared but mappings
+                    |            |                  |         | exclusive to processes, needs an exclusive entry in the
+                    |            |                  |         | top-level page table)
+   ffd9000000000000 |  -9.75  PB | ffdeffffffffffff |  1.5 PB | ... unused hole
    ffdf000000000000 |  -8.25  PB | fffffdffffffffff |   ~8 PB | KASAN shadow memory
   __________________|____________|__________________|_________|____________________________________________________________
                                                               |
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3b8cc39ae52d..9924d542d44a 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -32,6 +32,7 @@ config X86_64
 	select SWIOTLB
 	select X86_DEV_DMA_OPS
 	select ARCH_HAS_SYSCALL_WRAPPER
+	select ARCH_SUPPORTS_PROCLOCAL
 
 #
 # Arch settings
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index 939b1cff4a7b..e6f0d76de849 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -15,6 +15,10 @@ extern unsigned long page_offset_base;
 extern unsigned long vmalloc_base;
 extern unsigned long vmemmap_base;
 
+#ifdef CONFIG_PROCLOCAL
+extern unsigned long proclocal_base;
+#endif
+
 static inline unsigned long __phys_addr_nodebug(unsigned long x)
 {
 	unsigned long y = x - __START_KERNEL_map;
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 14cd41b989d6..cb1b789a55c2 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -141,6 +141,18 @@ extern unsigned int ptrs_per_p4d;
 
 #define VMALLOC_END (VMALLOC_START + (VMALLOC_SIZE_TB << 40) - 1)
 
+#ifdef CONFIG_PROCLOCAL
+# define __PROCLOCAL_BASE_L4 0xffffeb8000000000UL
+# define __PROCLOCAL_BASE_L5 0xffd8000000000000UL
+# define PROCLOCAL_SIZE (64UL * 1024 * 1024 * 1024)
+
+# ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
+# define PROCLOCAL_START proclocal_base
+# else /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
+# define PROCLOCAL_START __PROCLOCAL_BASE_L4
+# endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
+#endif /* CONFIG_PROCLOCAL */
+
 #define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
 /* The module sections ends with the start of the fixmap */
 #define MODULES_END _AC(0xfffffffff4000000, UL)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 509de5a2a122..490b5255aad3 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -59,6 +59,10 @@ unsigned long vmalloc_base __ro_after_init = __VMALLOC_BASE_L4;
 EXPORT_SYMBOL(vmalloc_base);
 unsigned long vmemmap_base __ro_after_init = __VMEMMAP_BASE_L4;
 EXPORT_SYMBOL(vmemmap_base);
+#ifdef CONFIG_PROCLOCAL
+unsigned long proclocal_base __ro_after_init = __PROCLOCAL_BASE_L4;
+EXPORT_SYMBOL(proclocal_base);
+#endif
 #endif
 
 #define __head __section(.head.text)
@@ -94,6 +98,10 @@ static bool __head check_la57_support(unsigned long physaddr)
 	*fixup_long(&page_offset_base, physaddr) = __PAGE_OFFSET_BASE_L5;
 	*fixup_long(&vmalloc_base, physaddr) = __VMALLOC_BASE_L5;
 	*fixup_long(&vmemmap_base, physaddr) = __VMEMMAP_BASE_L5;
+#ifdef CONFIG_PROCLOCAL
+#warning "Process-local memory with 5-level page tables is compile-tested only."
+	*fixup_long(&proclocal_base, physaddr) = __PROCLOCAL_BASE_L5;
+#endif
 
 	return true;
 }
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index abcb8d00b014..88fa2da94cfe 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -61,6 +61,9 @@ enum address_markers_idx {
 	LOW_KERNEL_NR,
 	VMALLOC_START_NR,
 	VMEMMAP_START_NR,
+#ifdef CONFIG_PROCLOCAL
+	PROCLOCAL_START_NR,
+#endif
 #ifdef CONFIG_KASAN
 	KASAN_SHADOW_START_NR,
 	KASAN_SHADOW_END_NR,
@@ -85,6 +88,9 @@ static struct addr_marker address_markers[] = {
 	[LOW_KERNEL_NR] = { 0UL, "Low Kernel Mapping" },
 	[VMALLOC_START_NR] = { 0UL, "vmalloc() Area" },
 	[VMEMMAP_START_NR] = { 0UL, "Vmemmap" },
+#ifdef CONFIG_PROCLOCAL
+	[PROCLOCAL_START_NR] = { 0UL, "Process local" },
+#endif
 #ifdef CONFIG_KASAN
 	/*
 	 * These fields get initialized with the (dynamic)
@@ -622,6 +628,9 @@ static int __init pt_dump_init(void)
 	address_markers[KASAN_SHADOW_START_NR].start_address = KASAN_SHADOW_START;
 	address_markers[KASAN_SHADOW_END_NR].start_address = KASAN_SHADOW_END;
 #endif
+#ifdef CONFIG_PROCLOCAL
+	address_markers[PROCLOCAL_START_NR].start_address = PROCLOCAL_START;
+#endif
 #endif
 #ifdef CONFIG_X86_32
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index ba51652fbd33..befea89c5d6f 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1171,6 +1171,15 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
 	return true;
 }
 
+static bool fault_in_process_local(unsigned long address)
+{
+#ifdef CONFIG_PROCLOCAL
+	return address >= PROCLOCAL_START && address < (PROCLOCAL_START + PROCLOCAL_SIZE);
+#else
+	return false;
+#endif
+}
+
 /*
  * Called for all faults where 'address' is part of the kernel address
  * space. Might get called for faults that originate from *code* that
@@ -1214,6 +1223,16 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,
 	if (spurious_kernel_fault(hw_error_code, address))
 		return;
 
+	/*
+	 * Faults in process-local memory may be caused by process-local
+	 * addresses leaking into other contexts.
+	 * tbd: warn and handle gracefully.
+	 */
+	if (unlikely(fault_in_process_local(address))) {
+		pr_err("page fault in PROCLOCAL at %lx\n", address);
+		force_sig_fault(SIGSEGV, SEGV_MAPERR, (void __user *)address, current);
+	}
+
 	/* kprobes don't want to hook the spurious faults: */
 	if (kprobes_fault(regs))
 		return;
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index c455f1ffba29..395d8868aeb8 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -45,6 +45,9 @@ enum {
 	PHYSMAP,
 	VMALLOC,
 	VMMEMMAP,
+#ifdef CONFIG_PROCLOCAL
+	PROCLOCAL,
+#endif
 };
 
 /*
@@ -59,6 +62,9 @@ static __initdata struct kaslr_memory_region {
 	[PHYSMAP] = { &page_offset_base, 0 },
 	[VMALLOC] = { &vmalloc_base, 0 },
 	[VMMEMMAP] = { &vmemmap_base, 1 },
+#ifdef CONFIG_PROCLOCAL
+	[PROCLOCAL] = { &proclocal_base, 0 },
+#endif
 };
 
 /* Get size in bytes used by the memory region */
@@ -76,6 +82,26 @@ static inline bool kaslr_memory_enabled(void)
 	return kaslr_enabled() && !IS_ENABLED(CONFIG_KASAN);
 }
 
+#ifdef CONFIG_PROCLOCAL
+/*
+ * The process-local memory area must use an exclusive pgd entry. The area is
+ * allocated as 2x PGDIR_SIZE such that it contains at least one exclusive pgd
+ * entry. Shift the base address into that exclusive pgd. Keep the offset from
+ * randomization but make sure the whole actual process-local memory region fits
+ * into the pgd.
+ */
+static void adjust_proclocal_base(void)
+{
+	unsigned long size_tb = kaslr_regions[PROCLOCAL].size_tb;
+	proclocal_base += ((size_tb << TB_SHIFT) / 2);
+	if ((proclocal_base % PGDIR_SIZE) > (PGDIR_SIZE - PROCLOCAL_SIZE))
+		proclocal_base -= PROCLOCAL_SIZE;
+
+	BUILD_BUG_ON(2 * PROCLOCAL_SIZE >= PGDIR_SIZE);
+	BUG_ON(((proclocal_base % PGDIR_SIZE) + PROCLOCAL_SIZE) > PGDIR_SIZE);
+}
+#endif
+
 /* Initialize base and padding for each memory region randomized with KASLR */
 void __init kernel_randomize_memory(void)
 {
@@ -103,6 +129,17 @@ void __init kernel_randomize_memory(void)
 	kaslr_regions[PHYSMAP].size_tb = 1 << (__PHYSICAL_MASK_SHIFT - TB_SHIFT);
 	kaslr_regions[VMALLOC].size_tb = VMALLOC_SIZE_TB;
 
+#ifdef CONFIG_PROCLOCAL
+	/*
+	 * Note that the process-local memory area must use a non-overlapping
+	 * pgd. Thus, round up the size to 2 pgd entries and adjust the base
+	 * address into the dedicated pgd below. With 4-level page tables, that
+	 * keeps the size at the minimum of 1 TiB used by the kernel.
+ */ + kaslr_regions[PROCLOCAL].size_tb = round_up(round_up(PROCLOCAL_SIZE, 2ULL< X-Patchwork-Id: 10990423 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CDF1813AF for ; Wed, 12 Jun 2019 17:10:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B1F4528A17 for ; Wed, 12 Jun 2019 17:10:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2FAE28A41; Wed, 12 Jun 2019 17:10:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BD81B28A17 for ; Wed, 12 Jun 2019 17:10:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D93CF6B0269; Wed, 12 Jun 2019 13:10:55 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id D43456B026A; Wed, 12 Jun 2019 13:10:55 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C0B696B026B; Wed, 12 Jun 2019 13:10:55 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) by kanga.kvack.org (Postfix) with ESMTP id 9BF216B0269 for ; Wed, 12 Jun 2019 13:10:55 -0400 (EDT) Received: by mail-qt1-f197.google.com with SMTP id o16so15179650qtj.6 for ; Wed, 12 Jun 2019 10:10:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:dkim-signature:from:to:cc:subject:date :message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=ux5wirT9zpu2UH49VFUeEuLSRIHtQX+sem65xFY2kAA=; b=NufE3aYZkp1aRpFef/3W0fKza0ROvVpPHcKxJo5NqFtY0pyqBhaoHFcARCxeckC1tx xW9MCTMd8jiSDAV4h8UPk4sA60ggMk8/M9znneizUbVCljliPSwL5Pyvt1pbrSmn9MmI lL6CVIfHCD25g2lhPf3BDYa/mkHvSCvh/KZdKKiAPrTw3N8h/p4wYQQ21uy/UUsCr5qh jPGKEmk2ulOtdaQsQoLks2ckdynafxf1Wo4FPMcqixSwUlXaKjJ4Oxi9oGyV9eocSyeH cRxhfdq4YBKdKyHia/AQitAZrxvuPk6iFEdX8HSbOADkfUOspvtMRr7OjMF+VY2CbdvY mxKA== X-Gm-Message-State: APjAAAUlOSOHmiiTRMBXOiSgTdiGrnrUmfRAVWPxcI6EWSYQgq3xD2tX iItmmQqPE6pB8oPuzuZgTYbtpZgLz7FCtW74dqtpUFn0QbrAj/OwXaE5cmQs38Jnxyzo6QtYNww 9AMDRpAFqbbT67F5pmwjn+DLiiwzrAL7AvBlAe4U7v/D+QG7I33InEmIG9DUR1A0U3w== X-Received: by 2002:ac8:303c:: with SMTP id f57mr71745048qte.294.1560359455340; Wed, 12 Jun 2019 10:10:55 -0700 (PDT) X-Google-Smtp-Source: APXvYqyeK/FuuqunV3BySWYKHa4qZUThV23uRL+msMxGFFuEw4JoHVwuTUW9fjhYoYzwd1xkfFCR X-Received: by 2002:ac8:303c:: with SMTP id f57mr71744982qte.294.1560359454309; Wed, 12 Jun 2019 10:10:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560359454; cv=none; d=google.com; s=arc-20160816; b=Nv5c0KwEbg5bXftBxPM0jAIm2xBOKv6Oi9BcbmB5QSdGV/iuFISuPvUg5aWe+CdAUi y0HAvGUy3+Q5eg4/YUQI1KOy5NNpc0SplbquFpSnTvXPe9OOBnGJcb6GEetDlmRk8Fpz jSrCrynPfIZH8GrlK8Lrcx29I6/TxZXusbat0VB9ey1xAzppdA7psui4+XB+Z83fYA/j 1KRXzmSItp0wpGcZfJQ4R85v3jXTqSpI6AckTmPG2W4TiLWDGFfFi9+MKFOF5nqASQBu BkG3QDok0IUbCgYlWQO5qe/cYOBexbPOpnvb4G4x3x2utrvWJrlMuQfVakf1L3oIKwC/ zkfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:dkim-signature; bh=ux5wirT9zpu2UH49VFUeEuLSRIHtQX+sem65xFY2kAA=; b=NsenurRhWtb0iNlwluYulZ7EXgBYueATixpboHgNZ93LLGfq2fLzuDJ4FaPOZptoKP K/QVEQc1nF91DE/eKUTF1kQtoUmZMFh9pNTgM6rSk6JFDZOPsDUCu4TnsBHYklyKVFkw L7O09oRAXB7xR+Hk02zTgjXTQVzEng3/oXY07xmtNLjTLXG90lf1I2VoUoSDLd91ieZI 
ap9WAq1jbQj+fYuKoAkGQO9h7QRdxxlbbCrIx/xY127IrZ3IIBGWVPPL3yALguHKpQ/s 2T0sljUwbOksDR/RyLjnYGwkvghdTKLClvBOk5yxM+LEqIwQDkQBXFfxNpRtG+LPC4oJ UKFw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=jfwfY+5h; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 72.21.196.25 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de Received: from smtp-fw-2101.amazon.com (smtp-fw-2101.amazon.com. [72.21.196.25]) by mx.google.com with ESMTPS id h20si10314qkg.262.2019.06.12.10.10.54 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 12 Jun 2019 10:10:54 -0700 (PDT) Received-SPF: pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 72.21.196.25 as permitted sender) client-ip=72.21.196.25; Authentication-Results: mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=jfwfY+5h; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 72.21.196.25 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1560359454; x=1591895454; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ux5wirT9zpu2UH49VFUeEuLSRIHtQX+sem65xFY2kAA=; b=jfwfY+5hSFUhdpfBBBqFlENGGtZ/X1/3KA2dCCqGUgXhZAwVFPCEb0c6 kHIlHs8dYW9moZqejSl0kbYbI/WRAIlLXJUHdAws4bwYPKja0gP3VhYk9 xQrydJcwwfCsT6JyCIdHIsaTdxuqIsZml/du0Td/nYHx/krJqLZ/iA6sz Q=; X-IronPort-AV: E=Sophos;i="5.62,366,1554768000"; d="scan'208";a="737183030" Received: from iad6-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-2b-859fe132.us-west-2.amazon.com) ([10.124.125.2]) by smtp-border-fw-out-2101.iad2.amazon.com with ESMTP; 12 Jun 2019 17:10:52 +0000 
Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (pdx2-ws-svc-lb17-vlan3.amazon.com [10.247.140.70]) by email-inbound-relay-2b-859fe132.us-west-2.amazon.com (Postfix) with ESMTPS id BEF16222032; Wed, 12 Jun 2019 17:10:51 +0000 (UTC) Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (ua08cfdeba6fe59dc80a8.ant.amazon.com [127.0.0.1]) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTP id x5CHAnko017677; Wed, 12 Jun 2019 19:10:49 +0200 Received: (from mhillenb@localhost) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Submit) id x5CHAnDT017675; Wed, 12 Jun 2019 19:10:49 +0200 From: Marius Hillenbrand To: kvm@vger.kernel.org Cc: Marius Hillenbrand , linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, Alexander Graf , David Woodhouse Subject: [RFC 03/10] x86/mm, mm,kernel: add teardown for process-local memory to mm cleanup Date: Wed, 12 Jun 2019 19:08:30 +0200 Message-Id: <20190612170834.14855-4-mhillenb@amazon.de> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de> References: <20190612170834.14855-1-mhillenb@amazon.de> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Process-local memory uses a dedicated pgd entry in kernel space and its own page table structure. Hook mm exit functions to cleanup those dedicated page tables. As a preparation, release any left-over process-local allocations in the address space. 
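The teardown order described above (first release left-over pages, then free the page tables that mapped them) can be illustrated with a small user-space model. This is only a sketch under loud assumptions: the two-level table, the tiny page size, and all toy_* names are hypothetical and are not part of this patch; the real code walks pgd/p4d/pud/pmd/pte and defers the page release until after a TLB flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ENTRIES   4   /* hypothetical: entries per table level */
#define TOY_PAGE_SIZE 64  /* hypothetical: tiny "page" for illustration */

/* Lowest level: each slot points at a mapped page, or is NULL. */
struct toy_leaf {
	unsigned char *pages[TOY_ENTRIES];
};

/* Top level: each slot points at a leaf table, or is NULL. */
struct toy_top {
	struct toy_leaf *leaves[TOY_ENTRIES];
};

/*
 * Phase 1: clear every entry that is still present, scrub the page,
 *          and release it.
 * Phase 2: free the leaf tables themselves.
 * Returns the number of left-over pages that were released.
 */
static size_t toy_teardown(struct toy_top *top)
{
	size_t released = 0;
	size_t i, j;

	assert(top != NULL);

	for (i = 0; i < TOY_ENTRIES; i++) {
		struct toy_leaf *leaf = top->leaves[i];

		if (!leaf)
			continue;
		for (j = 0; j < TOY_ENTRIES; j++) {
			unsigned char *page = leaf->pages[j];

			if (!page)
				continue;
			leaf->pages[j] = NULL;          /* unmap first */
			memset(page, 0, TOY_PAGE_SIZE); /* then scrub contents */
			free(page);
			released++;
		}
	}

	for (i = 0; i < TOY_ENTRIES; i++) {
		free(top->leaves[i]); /* free(NULL) is a no-op */
		top->leaves[i] = NULL;
	}
	return released;
}
```

Doing the page pass before the table pass mirrors the split between unmap_free_leftover_proclocal_pages() and arch_proclocal_teardown_pt() in the patch: once the tables are gone there is no way left to find the pages.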
Signed-off-by: Marius Hillenbrand Cc: Alexander Graf Cc: David Woodhouse --- arch/x86/include/asm/proclocal.h | 11 +++ arch/x86/mm/Makefile | 1 + arch/x86/mm/proclocal.c | 131 +++++++++++++++++++++++++++++++ include/linux/mm_types.h | 9 +++ include/linux/proclocal.h | 16 ++++ kernel/fork.c | 6 ++ mm/Makefile | 1 + mm/proclocal.c | 35 +++++++++ 8 files changed, 210 insertions(+) create mode 100644 arch/x86/include/asm/proclocal.h create mode 100644 arch/x86/mm/proclocal.c create mode 100644 include/linux/proclocal.h create mode 100644 mm/proclocal.c diff --git a/arch/x86/include/asm/proclocal.h b/arch/x86/include/asm/proclocal.h new file mode 100644 index 000000000000..a66983e49209 --- /dev/null +++ b/arch/x86/include/asm/proclocal.h @@ -0,0 +1,11 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + */ +#ifndef _ASM_X86_PROCLOCAL_H +#define _ASM_X86_PROCLOCAL_H + +struct mm_struct; + +void arch_proclocal_teardown_pages_and_pt(struct mm_struct *mm); + +#endif /* _ASM_X86_PROCLOCAL_H */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index e4ffcf74770e..a7c7111eb8f0 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -56,3 +56,4 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o obj-$(CONFIG_XPFO) += xpfo.o +obj-$(CONFIG_PROCLOCAL) += proclocal.o diff --git a/arch/x86/mm/proclocal.c b/arch/x86/mm/proclocal.c new file mode 100644 index 000000000000..c64a8ea6360d --- /dev/null +++ b/arch/x86/mm/proclocal.c @@ -0,0 +1,131 @@ +/* + * Architecture-specific code for handling process-local memory on x86-64. + * + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+ */ + +#include +#include +#include +#include +#include + + +static void unmap_leftover_pages_pte(struct mm_struct *mm, pmd_t *pmd, + unsigned long addr, unsigned long end, + struct list_head *page_list) +{ + pte_t *pte; + struct page *page; + + for (pte = pte_offset_map(pmd, addr); + addr < end; addr += PAGE_SIZE, pte++) { + if (!pte_present(*pte)) + continue; + + page = pte_page(*pte); + pte_clear(mm, addr, pte); + set_direct_map_default_noflush(page); + + /* + * scrub page contents. since mm teardown happens from a + * different mm, we cannot just use the process-local virtual + * address; access the page via the physmap instead. note that + * there is a small time frame where leftover data is globally + * visible in the kernel address space. + * + * tbd in later commit: scrub the page via a temporary mapping + * in process-local memory area before re-attaching it to the + * physmap. + */ + memset(page_to_virt(page), 0, PAGE_SIZE); + + /* + * track page for cleanup later; + * note that the proclocal_next list is used only for regular + * kfree_proclocal, so ripping pages out now is fine. + */ + INIT_LIST_HEAD(&page->proclocal_next); + list_add_tail(&page->proclocal_next, page_list); + } +} + +/* + * Walk through process-local mappings on each page table level. Avoid code + * duplication and use a macro to generate one function for each level. + * + * The macro generates a function for page table level LEVEL. The function is + * passed a pointer to the entry in the page table level ABOVE and recurses into + * the page table level BELOW. 
+ */ +#define UNMAP_LEFTOVER_LEVEL(LEVEL, ABOVE, BELOW) \ + static void unmap_leftover_pages_ ## LEVEL (struct mm_struct *mm, ABOVE ## _t *ABOVE, \ + unsigned long addr, unsigned long end, \ + struct list_head *page_list) \ + { \ + LEVEL ## _t *LEVEL = LEVEL ## _offset(ABOVE, addr); \ + unsigned long next; \ + do { \ + next = LEVEL ## _addr_end(addr, end); \ + if (LEVEL ## _present(*LEVEL)) \ + unmap_leftover_pages_## BELOW (mm, LEVEL, addr, next, page_list); \ + } while (LEVEL++, addr = next, addr < end); \ + } + +UNMAP_LEFTOVER_LEVEL(pmd, pud, pte) +UNMAP_LEFTOVER_LEVEL(pud, p4d, pmd) +UNMAP_LEFTOVER_LEVEL(p4d, pgd, pud) +#undef UNMAP_LEFTOVER_LEVEL + +extern void proclocal_release_pages(struct list_head *pages); + +static void unmap_free_leftover_proclocal_pages(struct mm_struct *mm) +{ + LIST_HEAD(page_list); + unsigned long addr = PROCLOCAL_START, next; + unsigned long end = PROCLOCAL_START + PROCLOCAL_SIZE; + + /* + * Walk page tables in process-local memory area and handle leftover + * process-local pages. Note that we cannot use the kernel's + * walk_page_range, because that function assumes walking across vmas. + */ + spin_lock(&mm->page_table_lock); + do { + pgd_t *pgd = pgd_offset(mm, addr); + next = pgd_addr_end(addr, end); + + if (pgd_present(*pgd)) { + unmap_leftover_pages_p4d(mm, pgd, addr, next, &page_list); + } + addr = next; + } while (addr < end); + spin_unlock(&mm->page_table_lock); + /* + * Flush any mappings of process-local pages from the TLBs, so that we + * can release the pages afterwards. + */ + flush_tlb_mm_range(mm, PROCLOCAL_START, end, PAGE_SHIFT, false); + proclocal_release_pages(&page_list); +} + +static void arch_proclocal_teardown_pt(struct mm_struct *mm) +{ + struct mmu_gather tlb; + /* + * clean up page tables for the whole pgd used exclusively by + * process-local memory. 
+ */ + unsigned long proclocal_base_pgd = PROCLOCAL_START & PGDIR_MASK; + unsigned long proclocal_end_pgd = proclocal_base_pgd + PGDIR_SIZE; + + tlb_gather_mmu(&tlb, mm, proclocal_base_pgd, proclocal_end_pgd); + free_pgd_range(&tlb, proclocal_base_pgd, proclocal_end_pgd, 0, 0); + tlb_finish_mmu(&tlb, proclocal_base_pgd, proclocal_end_pgd); +} + +void arch_proclocal_teardown_pages_and_pt(struct mm_struct *mm) +{ + unmap_free_leftover_proclocal_pages(mm); + arch_proclocal_teardown_pt(mm); +} diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 2fe4dbfcdebd..1cb9243dd299 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -158,6 +158,15 @@ struct page { /** @rcu_head: You can use this to free a page by RCU. */ struct rcu_head rcu_head; + +#ifdef CONFIG_PROCLOCAL + struct { /* PROCLOCAL pages */ + struct list_head proclocal_next; /* track pages in one allocation */ + unsigned long _proclocal_pad_1; /* mapping */ + /* head page of an allocation stores its length */ + size_t proclocal_nr_pages; + }; +#endif }; union { /* This union is 4 bytes in size. */ diff --git a/include/linux/proclocal.h b/include/linux/proclocal.h new file mode 100644 index 000000000000..9dae140c0796 --- /dev/null +++ b/include/linux/proclocal.h @@ -0,0 +1,16 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+ */ +#ifndef _PROCLOCAL_H +#define _PROCLOCAL_H + +#ifdef CONFIG_PROCLOCAL + +struct mm_struct; + +void proclocal_mm_exit(struct mm_struct *mm); +#else /* !CONFIG_PROCLOCAL */ +static inline void proclocal_mm_exit(struct mm_struct *mm) { } +#endif + +#endif /* _PROCLOCAL_H */ diff --git a/kernel/fork.c b/kernel/fork.c index 6e37f5626417..caca6b16ee1e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -92,6 +92,7 @@ #include #include #include +#include #include #include @@ -1051,6 +1052,11 @@ static inline void __mmput(struct mm_struct *mm) exit_mmap(mm); mm_put_huge_zero_page(mm); set_mm_exe_file(mm, NULL); + /* + * No real users of this address space left, dropping process-local + * mappings. + */ + proclocal_mm_exit(mm); if (!list_empty(&mm->mmlist)) { spin_lock(&mmlist_lock); list_del(&mm->mmlist); diff --git a/mm/Makefile b/mm/Makefile index e99e1e6ae5ae..029d7e2ee80b 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -100,3 +100,4 @@ obj-$(CONFIG_PERCPU_STATS) += percpu-stats.o obj-$(CONFIG_HMM) += hmm.o obj-$(CONFIG_MEMFD_CREATE) += memfd.o obj-$(CONFIG_XPFO) += xpfo.o +obj-$(CONFIG_PROCLOCAL) += proclocal.o diff --git a/mm/proclocal.c b/mm/proclocal.c new file mode 100644 index 000000000000..72c485c450bf --- /dev/null +++ b/mm/proclocal.c @@ -0,0 +1,35 @@ +/* + * mm/proclocal.c + * + * The code in this file implements process-local mappings in the Linux kernel + * address space. This memory is only usable in the process context. With memory + * not globally visible in the kernel, it cannot easily be prefetched and leaked + * via L1TF. + * + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+ */ +#include +#include +#include +#include +#include + +#include +#include +#include + +void proclocal_release_pages(struct list_head *pages) +{ + struct page *pos, *n; + list_for_each_entry_safe(pos, n, pages, proclocal_next) { + list_del(&pos->proclocal_next); + __free_page(pos); + } +} + +void proclocal_mm_exit(struct mm_struct *mm) +{ + pr_debug("proclocal_mm_exit for mm %p pgd %p (current is %p)\n", mm, mm->pgd, current); + + arch_proclocal_teardown_pages_and_pt(mm); +} From patchwork Wed Jun 12 17:08:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marius Hillenbrand X-Patchwork-Id: 10990427 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 836361395 for ; Wed, 12 Jun 2019 17:11:08 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6A3EA289EE for ; Wed, 12 Jun 2019 17:11:08 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5DC4D28A28; Wed, 12 Jun 2019 17:11:08 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 93467289EE for ; Wed, 12 Jun 2019 17:11:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9E9F16B026A; Wed, 12 Jun 2019 13:11:06 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 99A666B026B; Wed, 12 Jun 2019 13:11:06 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by 
(8.15.2/8.15.2/Submit) id x5CHAxh5017849; Wed, 12 Jun 2019 19:10:59 +0200 From: Marius Hillenbrand To: kvm@vger.kernel.org Cc: Marius Hillenbrand , linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, Alexander Graf , David Woodhouse Subject: [RFC 04/10] mm: allocate virtual space for process-local memory Date: Wed, 12 Jun 2019 19:08:32 +0200 Message-Id: <20190612170834.14855-5-mhillenb@amazon.de> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de> References: <20190612170834.14855-1-mhillenb@amazon.de> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Implement first half of kmalloc and kfree for process-local memory, which deals with allocating virtual address ranges in the process-local memory area. While the process-local mappings will be visible only in a single address space, the address of each allocation is still unique to aid in debugging (e.g., to see page faults instead of silent invalid accesses). For that purpose, use a global allocator for virtual address ranges of process-local mappings. Note that the single centralized lock is good enough for our use case. 
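To make the "virtual range plus guard page" idea concrete, here is a minimal user-space sketch. It is an illustration only, not the allocator used by this patch: the patch relies on a gen_pool behind a spinlock, whereas the sketch uses a first-fit bitmap, no locking, and requires the caller to pass the size on free (the patch reads the length from the head page instead). All sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096UL        /* assumption: 4 KiB pages */
#define SKETCH_AREA_START 0x10000000UL  /* hypothetical base, stands in for PROCLOCAL_START */
#define SKETCH_AREA_PAGES 64            /* hypothetical area size, in pages */

/* One flag per page of the reserved area; true means "handed out". */
static bool sketch_used[SKETCH_AREA_PAGES];

/* Pages needed to back size bytes, rounded up to whole pages. */
static size_t sketch_pages_for(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * First-fit search for a free run of (data pages + 1 guard page).
 * Returns the start address of the run, or 0 if the area is exhausted.
 */
static unsigned long sketch_alloc_virtual(size_t size)
{
	size_t want, run = 0, i, j;

	assert(size > 0);
	want = sketch_pages_for(size) + 1; /* + guard page */

	for (i = 0; i < SKETCH_AREA_PAGES; i++) {
		run = sketch_used[i] ? 0 : run + 1;
		if (run < want)
			continue;
		for (j = i + 1 - want; j <= i; j++)
			sketch_used[j] = true;
		return SKETCH_AREA_START + (i + 1 - want) * SKETCH_PAGE_SIZE;
	}
	return 0;
}

/* Give back the run that sketch_alloc_virtual() returned for size bytes. */
static void sketch_free_virtual(unsigned long addr, size_t size)
{
	size_t first = (addr - SKETCH_AREA_START) / SKETCH_PAGE_SIZE;
	size_t count = sketch_pages_for(size) + 1; /* data pages + guard page */
	size_t j;

	assert(first + count <= SKETCH_AREA_PAGES);
	for (j = first; j < first + count; j++)
		sketch_used[j] = false;
}
```

Because the guard page is reserved but never backed, two allocations are never adjacent in the virtual range, so an overrun past the end of one allocation hits an unmapped address rather than the neighbouring allocation.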
Signed-off-by: Marius Hillenbrand Cc: Alexander Graf Cc: David Woodhouse --- arch/x86/mm/proclocal.c | 7 +- include/linux/mm_types.h | 4 + include/linux/proclocal.h | 19 +++++ mm/proclocal.c | 150 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 179 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/proclocal.c b/arch/x86/mm/proclocal.c index c64a8ea6360d..dd641995cc9f 100644 --- a/arch/x86/mm/proclocal.c +++ b/arch/x86/mm/proclocal.c @@ -10,6 +10,8 @@ #include #include +extern void handle_proclocal_page(struct mm_struct *mm, struct page *page, + unsigned long addr); static void unmap_leftover_pages_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -27,6 +29,8 @@ static void unmap_leftover_pages_pte(struct mm_struct *mm, pmd_t *pmd, pte_clear(mm, addr, pte); set_direct_map_default_noflush(page); + /* callback to non-arch allocator */ + handle_proclocal_page(mm, page, addr); /* * scrub page contents. since mm teardown happens from a * different mm, we cannot just use the process-local virtual @@ -126,6 +130,7 @@ static void arch_proclocal_teardown_pt(struct mm_struct *mm) void arch_proclocal_teardown_pages_and_pt(struct mm_struct *mm) { - unmap_free_leftover_proclocal_pages(mm); + if (mm->proclocal_nr_pages) + unmap_free_leftover_proclocal_pages(mm); arch_proclocal_teardown_pt(mm); } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 1cb9243dd299..4bd8737cc7a6 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -510,6 +510,10 @@ struct mm_struct { /* HMM needs to track a few things per mm */ struct hmm *hmm; #endif + +#ifdef CONFIG_PROCLOCAL + size_t proclocal_nr_pages; +#endif } __randomize_layout; /* diff --git a/include/linux/proclocal.h b/include/linux/proclocal.h index 9dae140c0796..c408e0d1104c 100644 --- a/include/linux/proclocal.h +++ b/include/linux/proclocal.h @@ -8,8 +8,27 @@ struct mm_struct; +void *kmalloc_proclocal(size_t size); +void *kzalloc_proclocal(size_t size); +void 
kfree_proclocal(void *vaddr); + void proclocal_mm_exit(struct mm_struct *mm); #else /* !CONFIG_PROCLOCAL */ +static inline void *kmalloc_proclocal(size_t size) +{ + return kmalloc(size, GFP_KERNEL); +} + +static inline void * kzalloc_proclocal(size_t size) +{ + return kzalloc(size, GFP_KERNEL); +} + +static inline void kfree_proclocal(void *vaddr) +{ + kfree(vaddr); +} + static inline void proclocal_mm_exit(struct mm_struct *mm) { } #endif diff --git a/mm/proclocal.c b/mm/proclocal.c index 72c485c450bf..7a6217faf765 100644 --- a/mm/proclocal.c +++ b/mm/proclocal.c @@ -8,6 +8,7 @@ * * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. */ +#include #include #include #include @@ -18,6 +19,43 @@ #include #include +static pte_t *pte_lookup_map(struct mm_struct *mm, unsigned long kvaddr) +{ + pgd_t *pgd = pgd_offset(mm, kvaddr); + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + + if (IS_ERR_OR_NULL(pgd) || !pgd_present(*pgd)) + return ERR_PTR(-1); + + p4d = p4d_offset(pgd, kvaddr); + if (IS_ERR_OR_NULL(p4d) || !p4d_present(*p4d)) + return ERR_PTR(-1); + + pud = pud_offset(p4d, kvaddr); + if (IS_ERR_OR_NULL(pud) || !pud_present(*pud)) + return ERR_PTR(-1); + + pmd = pmd_offset(pud, kvaddr); + if (IS_ERR_OR_NULL(pmd) || !pmd_present(*pmd)) + return ERR_PTR(-1); + + return pte_offset_map(pmd, kvaddr); +} + +static struct page *proclocal_find_first_page(struct mm_struct *mm, const void *kvaddr) +{ + pte_t *ptep = pte_lookup_map(mm, (unsigned long) kvaddr); + + if(IS_ERR_OR_NULL(ptep)) + return NULL; + if (!pte_present(*ptep)) + return NULL; + + return pfn_to_page(pte_pfn(*ptep)); +} + void proclocal_release_pages(struct list_head *pages) { struct page *pos, *n; @@ -27,9 +65,121 @@ void proclocal_release_pages(struct list_head *pages) } } +static DEFINE_SPINLOCK(proclocal_lock); +static struct gen_pool *allocator; + +static int proclocal_allocator_init(void) +{ + int rc; + + allocator = gen_pool_create(PAGE_SHIFT, -1); + if (unlikely(IS_ERR(allocator))) + 
return PTR_ERR(allocator); + if (!allocator) + return -1; + + rc = gen_pool_add(allocator, PROCLOCAL_START, PROCLOCAL_SIZE, -1); + + if (rc) + gen_pool_destroy(allocator); + + return rc; +} +late_initcall(proclocal_allocator_init); + +static void *alloc_virtual(size_t nr_pages) +{ + void *kvaddr; + spin_lock(&proclocal_lock); + kvaddr = (void *)gen_pool_alloc(allocator, nr_pages * PAGE_SIZE); + spin_unlock(&proclocal_lock); + return kvaddr; +} + +static void free_virtual(const void *kvaddr, size_t nr_pages) +{ + spin_lock(&proclocal_lock); + gen_pool_free(allocator, (unsigned long)kvaddr, + nr_pages * PAGE_SIZE); + spin_unlock(&proclocal_lock); +} + +void *kmalloc_proclocal(size_t size) +{ + void *kvaddr = NULL; + size_t nr_pages = round_up(size, PAGE_SIZE) / PAGE_SIZE; + size_t nr_pages_virtual = nr_pages + 1; /* + guard page */ + + BUG_ON(!current); + if (!size) + return ZERO_SIZE_PTR; + might_sleep(); + + kvaddr = alloc_virtual(nr_pages_virtual); + + if (IS_ERR_OR_NULL(kvaddr)) + return kvaddr; + + /* tbd: subsequent patch will allocate and map physical pages */ + + return kvaddr; +} +EXPORT_SYMBOL(kmalloc_proclocal); + +void * kzalloc_proclocal(size_t size) +{ + void * kvaddr = kmalloc_proclocal(size); + + if (!IS_ERR_OR_NULL(kvaddr)) + memset(kvaddr, 0, size); + return kvaddr; +} +EXPORT_SYMBOL(kzalloc_proclocal); + +void kfree_proclocal(void *kvaddr) +{ + struct page *first_page; + int nr_pages; + struct mm_struct *mm; + + if (!kvaddr || kvaddr == ZERO_SIZE_PTR) + return; + + pr_debug("kfree for %p (current %p mm %p)\n", kvaddr, + current, current ? current->mm : 0); + + BUG_ON((unsigned long)kvaddr < PROCLOCAL_START); + BUG_ON((unsigned long)kvaddr >= (PROCLOCAL_START + PROCLOCAL_SIZE)); + BUG_ON(!current); + + mm = current->mm; + down_write(&mm->mmap_sem); + first_page = proclocal_find_first_page(mm, kvaddr); + if (IS_ERR_OR_NULL(first_page)) { + pr_err("double-free?!\n"); + BUG(); + } /* double-free? 
*/ + nr_pages = first_page->proclocal_nr_pages; + BUG_ON(!nr_pages); + mm->proclocal_nr_pages -= nr_pages; + /* subsequent patch will unmap and release physical pages */ + up_write(&mm->mmap_sem); + + free_virtual(kvaddr, nr_pages + 1); +} +EXPORT_SYMBOL(kfree_proclocal); + void proclocal_mm_exit(struct mm_struct *mm) { pr_debug("proclocal_mm_exit for mm %p pgd %p (current is %p)\n", mm, mm->pgd, current); arch_proclocal_teardown_pages_and_pt(mm); } + +void handle_proclocal_page(struct mm_struct *mm, struct page *page, + unsigned long addr) +{ + if (page->proclocal_nr_pages) { + free_virtual((void *)addr, page->proclocal_nr_pages + 1); + } +} From patchwork Wed Jun 12 17:08:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marius Hillenbrand X-Patchwork-Id: 10990431 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 087BA13AF for ; Wed, 12 Jun 2019 17:11:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E318F2817F for ; Wed, 12 Jun 2019 17:11:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D4B7328A17; Wed, 12 Jun 2019 17:11:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 83D682817F for ; Wed, 12 Jun 2019 17:11:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9739F6B026B; Wed, 12 Jun 2019 13:11:21 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from 
(8.15.2/8.15.2/Submit) id x5CHBDRf018047; Wed, 12 Jun 2019 19:11:13 +0200 From: Marius Hillenbrand To: kvm@vger.kernel.org Cc: Marius Hillenbrand , linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, Alexander Graf , David Woodhouse Subject: [RFC 05/10] mm: allocate/release physical pages for process-local memory Date: Wed, 12 Jun 2019 19:08:34 +0200 Message-Id: <20190612170834.14855-6-mhillenb@amazon.de> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de> References: <20190612170834.14855-1-mhillenb@amazon.de> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP Implement second half of kmalloc and kfree for process-local memory, which allocates physical pages, maps them into the kernel virtual address space of the current process, and removes them from the kernel's shared direct physical mapping. On kfree, the code performs that sequence in reverse, after scrubbing the pages' contents. Note that both the allocation and free path require TLB flushes, to flush remaining mappings in the direct physical mappings (on allocation) and in the process-local memory (on release). Aim to keep the impact of these flushes minimal by flushing only necessary address ranges. The allocation only handles the smallest page size (i.e., 4 KiB), no huge pages, because in our use case, the size of individual allocations is in the order of 4 KiB. 
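The "flush only necessary address ranges" point boils down to accumulating the smallest range that covers every page touched, then issuing one ranged flush. A minimal user-space sketch of that accumulator follows; it is modelled on track_page_to_flush() from this patch, but the sketch_* names and the explicit init/empty helpers are hypothetical additions for illustration.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/* Smallest half-open range [start, end) covering all tracked pages. */
struct sketch_flush_range {
	unsigned long start;
	unsigned long end;
};

/* Start out empty (start above end), so the first page sets both bounds. */
static void sketch_flush_range_init(struct sketch_flush_range *r)
{
	r->start = ULONG_MAX;
	r->end = 0;
}

/* Widen the range so that it also covers the page at page_start. */
static void sketch_track_page(struct sketch_flush_range *r,
			      unsigned long page_start)
{
	unsigned long page_end;

	assert(page_start % SKETCH_PAGE_SIZE == 0);
	page_end = page_start + SKETCH_PAGE_SIZE;

	if (page_start < r->start)
		r->start = page_start;
	if (page_end > r->end)
		r->end = page_end;
}

/* True while no page has been tracked; a caller can then skip the flush. */
static int sketch_flush_range_empty(const struct sketch_flush_range *r)
{
	return r->start >= r->end;
}
```

A single range over-approximates when the tracked pages are scattered, since everything between the lowest and the highest page is covered too. That trades some flush precision for one flush call per allocation.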
Signed-off-by: Marius Hillenbrand Cc: Alexander Graf Cc: David Woodhouse --- mm/proclocal.c | 167 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 165 insertions(+), 2 deletions(-) diff --git a/mm/proclocal.c b/mm/proclocal.c index 7a6217faf765..26bc1f3f68a2 100644 --- a/mm/proclocal.c +++ b/mm/proclocal.c @@ -56,6 +56,70 @@ static struct page *proclocal_find_first_page(struct mm_struct *mm, const void * return pfn_to_page(pte_pfn(*ptep)); } +/* + * Lookup PTE for a given virtual address. Allocate page table structures, if + * they are not present yet. + */ +static pte_t *pte_lookup_alloc_map(struct mm_struct *mm, unsigned long kvaddr) +{ + pgd_t *pgd = pgd_offset(mm, kvaddr); + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + + p4d = p4d_alloc(mm, pgd, kvaddr); + if (IS_ERR_OR_NULL(p4d)) + return (pte_t *)p4d; + + pud = pud_alloc(mm, p4d, kvaddr); + if (IS_ERR_OR_NULL(pud)) + return (pte_t *)pud; + + pmd = pmd_alloc(mm, pud, kvaddr); + if (IS_ERR_OR_NULL(pmd)) + return (pte_t *)pmd; + + return pte_alloc_map(mm, pmd, kvaddr); +} + +static int proclocal_map_notlbflush(struct mm_struct *mm, struct page *page, void *kvaddr) +{ + int rc; + pte_t *ptep = pte_lookup_alloc_map(mm, (unsigned long)kvaddr); + + if (IS_ERR_OR_NULL(ptep)) { + pr_err("failed to pte_lookup_alloc_map, ptep=0x%lx\n", + (unsigned long)ptep); + return ptep ? PTR_ERR(ptep) : -ENOMEM; + } + + set_pte(ptep, mk_pte(page, kmap_prot)); + rc = set_direct_map_invalid_noflush(page); + if (rc) + pte_clear(mm, (unsigned long)kvaddr, ptep); + else + pr_debug("map pfn %lx at %p for mm %p pgd %p\n", page_to_pfn(page), kvaddr, mm, mm->pgd); + return rc; +} + +static void proclocal_unmap_page_notlbflush(struct mm_struct *mm, void *vaddr) +{ + pte_t *ptep = pte_lookup_map(mm, (unsigned long)vaddr); + pte_t pte; + struct page *page; + + BUG_ON(IS_ERR_OR_NULL(ptep)); + BUG_ON(!pte_present(*ptep)); // already cleared?! 
+ + /* scrub page contents */ + memset(vaddr, 0, PAGE_SIZE); + + pte = ptep_get_and_clear(mm, (unsigned long)vaddr, ptep); + page = pfn_to_page(pte_pfn(pte)); + + BUG_ON(set_direct_map_default_noflush(page)); /* should never fail for mapped 4K-pages */ +} + void proclocal_release_pages(struct list_head *pages) { struct page *pos, *n; @@ -65,6 +129,76 @@ void proclocal_release_pages(struct list_head *pages) } } +static void proclocal_release_pages_incl_head(struct list_head *pages) +{ + proclocal_release_pages(pages); + /* the list_head itself is embedded in a struct page we want to release. */ + __free_page(list_entry(pages, struct page, proclocal_next)); +} + +struct physmap_tlb_flush { + unsigned long start; + unsigned long end; +}; + +static inline void track_page_to_flush(struct physmap_tlb_flush *flush, struct page *page) +{ + const unsigned long page_start = (unsigned long)page_to_virt(page); + const unsigned long page_end = page_start + PAGE_SIZE; + + if (page_start < flush->start) + flush->start = page_start; + if (page_end > flush->end) + flush->end = page_end; +} + +static int alloc_and_map_proclocal_pages(struct mm_struct *mm, void *kvaddr, size_t nr_pages) +{ + int rc; + size_t i, j; + struct page *page; + struct list_head *pages_list = NULL; + struct physmap_tlb_flush flush = { -1, 0 }; + + for (i = 0; i < nr_pages; i++) { + page = alloc_page(GFP_KERNEL); + + if (!page) { + rc = -ENOMEM; + goto unmap_release; + } + + rc = proclocal_map_notlbflush(mm, page, kvaddr + i * PAGE_SIZE); + if (rc) { + __free_page(page); + goto unmap_release; + } + + track_page_to_flush(&flush, page); + INIT_LIST_HEAD(&page->proclocal_next); + /* track allocation in first struct page */ + if (!pages_list) { + pages_list = &page->proclocal_next; + page->proclocal_nr_pages = nr_pages; + } else { + list_add_tail(&page->proclocal_next, pages_list); + page->proclocal_nr_pages = 0; + } + } + + /* flush direct mappings of allocated pages from TLBs. 
*/ + flush_tlb_kernel_range(flush.start, flush.end); + return 0; + +unmap_release: + for (j = 0; j < i; j++) + proclocal_unmap_page_notlbflush(mm, kvaddr + j * PAGE_SIZE); + + if (pages_list) + proclocal_release_pages_incl_head(pages_list); + return rc; +} + static DEFINE_SPINLOCK(proclocal_lock); static struct gen_pool *allocator; @@ -106,9 +240,11 @@ static void free_virtual(const void *kvaddr, size_t nr_pages) void *kmalloc_proclocal(size_t size) { + int rc; void *kvaddr = NULL; size_t nr_pages = round_up(size, PAGE_SIZE) / PAGE_SIZE; size_t nr_pages_virtual = nr_pages + 1; /* + guard page */ + struct mm_struct *mm; BUG_ON(!current); if (!size) @@ -120,7 +256,18 @@ void *kmalloc_proclocal(size_t size) if (IS_ERR_OR_NULL(kvaddr)) return kvaddr; - /* tbd: subsequent patch will allocate and map physical pages */ + mm = current->mm; + down_write(&mm->mmap_sem); + rc = alloc_and_map_proclocal_pages(mm, kvaddr, nr_pages); + if (!rc) + mm->proclocal_nr_pages += nr_pages; + up_write(&mm->mmap_sem); + + if (unlikely(rc)) + kvaddr = ERR_PTR(rc); + + pr_debug("allocated %zd bytes at %p (current %p mm %p)\n", size, kvaddr, + current, current ? 
current->mm : 0); return kvaddr; } @@ -138,6 +285,7 @@ EXPORT_SYMBOL(kzalloc_proclocal); void kfree_proclocal(void *kvaddr) { + int i; struct page *first_page; int nr_pages; struct mm_struct *mm; @@ -152,8 +300,10 @@ void kfree_proclocal(void *kvaddr) BUG_ON((unsigned long)kvaddr >= (PROCLOCAL_START + PROCLOCAL_SIZE)); BUG_ON(!current); + might_sleep(); mm = current->mm; down_write(&mm->mmap_sem); + first_page = proclocal_find_first_page(mm, kvaddr); if (IS_ERR_OR_NULL(first_page)) { pr_err("double-free?!\n"); @@ -162,9 +312,22 @@ void kfree_proclocal(void *kvaddr) nr_pages = first_page->proclocal_nr_pages; BUG_ON(!nr_pages); mm->proclocal_nr_pages -= nr_pages; - /* subsequent patch will unmap and release physical pages */ + + for (i = 0; i < nr_pages; i++) + proclocal_unmap_page_notlbflush(mm, kvaddr + i * PAGE_SIZE); + up_write(&mm->mmap_sem); + /* + * Flush process-local mappings from TLBs so that we can release the + * pages afterwards. + */ + flush_tlb_mm_range(mm, (unsigned long)kvaddr, + (unsigned long)kvaddr + nr_pages * PAGE_SIZE, + PAGE_SHIFT, false); + + proclocal_release_pages_incl_head(&first_page->proclocal_next); + free_virtual(kvaddr, nr_pages + 1); } EXPORT_SYMBOL(kfree_proclocal); From patchwork Wed Jun 12 17:08:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marius Hillenbrand X-Patchwork-Id: 10990435 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 65F6913AF for ; Wed, 12 Jun 2019 17:11:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4AC0028841 for ; Wed, 12 Jun 2019 17:11:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3BF9D28A17; Wed, 12 Jun 2019 17:11:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) 
on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6BC2828841 for ; Wed, 12 Jun 2019 17:11:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 862CD6B026C; Wed, 12 Jun 2019 13:11:40 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 813386B026D; Wed, 12 Jun 2019 13:11:40 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6DAAB6B026E; Wed, 12 Jun 2019 13:11:40 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by kanga.kvack.org (Postfix) with ESMTP id 4B02E6B026C for ; Wed, 12 Jun 2019 13:11:40 -0400 (EDT) Received: by mail-qt1-f198.google.com with SMTP id z16so15175527qto.10 for ; Wed, 12 Jun 2019 10:11:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:from:to:cc:subject:date :message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=mHyM8SWHlAfs7A/FW6SKOCGNM1rYZjxh4Tkjo7ZhY7U=; b=WHs4NJLyK+221pRnef4OAQP4T1kDXtDAwTUpefjJpc/DYcXMka9whcPeSdFIti4lFU jFzILf5BwHpqtmSIhdynM4mkAA26ZhMGJ8uPB0ZxcDyRwuaVVTiC50SnMHYFbXmjs8GE x2AWVhB32Nb8sqdzfVS3+zdykdWMPdu1DDWma55Ib+RJqr+lIo7adgRYQvg50jyK4OMf z05rRtYgA6aYjc2rr2Fsw1h6J4GbRjArhO3rofkioRXgc8HDn6DMWow357GcJjklGNeE u1ZmxN5sl8wW3Jz7gzObMscCwwMSokm5+HXoJHtJYGuLp9kYh0NtSdpWGcrjw29EFh9K cDVQ== X-Gm-Message-State: APjAAAUY/cQ6CMT9BEJyhe219K/Wkd8Pxu7EOEVQor0itvl0w2KUUNrR B89+hTFqqF/U3Cokerv09GwQ4hvf+Hr0bZYiGqAgjRyWEWSX8TLjDs5OEOemtzKLqRvh2X8W3QT 
0A+pIN6kG2m/YHVOHEZbHxq0AvbuNtKVBxt41m2ZU3vS1ovwNaIrt2aVa0Elgn5d1fg== X-Received: by 2002:ac8:4601:: with SMTP id p1mr71942330qtn.181.1560359499965; Wed, 12 Jun 2019 10:11:39 -0700 (PDT) X-Google-Smtp-Source: APXvYqzv8YgQFGAlIosKzE6Y8gZylL1e3UeYonjtrvG/PB0em2fOPU2QBOQx+NI6HRX4Zu7uE4Dr X-Received: by 2002:ac8:4601:: with SMTP id p1mr71942241qtn.181.1560359498360; Wed, 12 Jun 2019 10:11:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560359498; cv=none; d=google.com; s=arc-20160816; b=MjUfRcYqHSfzKhxuvJpKj0Th7w+vcm9PrcQnx2S9qrPfoizJNIjiOL96eRxZq48VaM mLSY+ixAjgCB/Y/FvDpxk23+cm3s4Q299MfnvZMbgrP9IdwszG8/5Tk9EEowtpSLrTfX 2GQ782byuQwMahISN7MTjwsvZo2rQMI/Ol02s1F+US95ifEFV8sjUOXUgYs0DeScHvl9 igxHbkmkpar2h/tZDTfHLxaCEHmyksLD/I/tz503Kbf0hxCHRy6Jtflo6Q5hPoKvxlBk S1+h3kcuD7fTEOK4JtUBjXU9g1S+Vgpci91+QZLbGEITzJxOs0Kw0I2qqJAY5cQQ5Ty8 TGNw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:dkim-signature; bh=mHyM8SWHlAfs7A/FW6SKOCGNM1rYZjxh4Tkjo7ZhY7U=; b=Tb539RKvTfVyVfFT5j7ZsCddGo6Y6tV8IO/Q7v0gDGgkXNJT0z8lFOkX4PnlwacuMG 7uDw2yBItw7pCobh5uUB/3XY8FGYezSOOU66nVi9nWusxKIU+H0/PnjEam4tr+jYgH+F /b0nwC4LcqPWD6AYVFhKSLi7o3EV4IjWqFQ/e+6iEQLDzWJX2MjUqjheLWVMrvxFmXG9 gHyt6CnwvHCyEEB3v6ZntkW6XZVBlRRFEsBSiFrNvMX6YjmWzuGguVECtN/CvDmkBgvX U7XLVfcKUONem6aGfDu4uRKx5LUNXWcYkRK9+miaeYRXktmyRPCaXJKVVUAZxsHO6Gyl 95Ng== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=uegwbgsJ; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 52.95.49.90 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de Received: from smtp-fw-6002.amazon.com (smtp-fw-6002.amazon.com. 
[52.95.49.90]) by mx.google.com with ESMTPS id q54si312674qtf.138.2019.06.12.10.11.38 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 12 Jun 2019 10:11:38 -0700 (PDT) Received-SPF: pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 52.95.49.90 as permitted sender) client-ip=52.95.49.90; Authentication-Results: mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=uegwbgsJ; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 52.95.49.90 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1560359498; x=1591895498; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mHyM8SWHlAfs7A/FW6SKOCGNM1rYZjxh4Tkjo7ZhY7U=; b=uegwbgsJzOC7Sam9PEZ8KhrDVtTbb/25iUP9MY2fmSXMCc7j4Wrycr6q MIMxfL2p0wx4XYL1iZoOTtvVrtB7PhFuxm+prJaycTE3MFiIvwLduU95u ecacg5PhNY3gQT2IGEBKqdBcMvvqKPAfKNDbDU9SWN+tjC1bpCANSx2P9 o=; X-IronPort-AV: E=Sophos;i="5.62,366,1554768000"; d="scan'208";a="406138085" Received: from iad6-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2b-4ff6265a.us-west-2.amazon.com) ([10.124.125.6]) by smtp-border-fw-out-6002.iad6.amazon.com with ESMTP; 12 Jun 2019 17:11:30 +0000 Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (pdx2-ws-svc-lb17-vlan2.amazon.com [10.247.140.66]) by email-inbound-relay-2b-4ff6265a.us-west-2.amazon.com (Postfix) with ESMTPS id 57BB3A1956; Wed, 12 Jun 2019 17:11:29 +0000 (UTC) Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (ua08cfdeba6fe59dc80a8.ant.amazon.com [127.0.0.1]) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTP id x5CHBRZt018256; Wed, 12 Jun 2019 19:11:27 +0200 Received: (from mhillenb@localhost) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Submit) id 
x5CHBQ5g018255; Wed, 12 Jun 2019 19:11:26 +0200 From: Marius Hillenbrand To: kvm@vger.kernel.org Cc: Marius Hillenbrand , linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, Alexander Graf , David Woodhouse Subject: [RFC 06/10] kvm/x86: add support for storing vCPU state in process-local memory Date: Wed, 12 Jun 2019 19:08:36 +0200 Message-Id: <20190612170834.14855-7-mhillenb@amazon.de> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de> References: <20190612170834.14855-1-mhillenb@amazon.de> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP The hidden KVM state will both contain guest state that is specific to x86-64 as well as state specific to SVM or VMX, respectively. Thus, allocate the hidden state in the code paths specific to SVM and VMX. For the code that is shared between SVM and VMX, introduce a common struct for hidden guest state. Signed-off-by: Marius Hillenbrand Cc: Alexander Graf Cc: David Woodhouse --- arch/x86/include/asm/kvm_host.h | 9 ++++++++ arch/x86/kvm/Kconfig | 10 +++++++++ arch/x86/kvm/svm.c | 37 +++++++++++++++++++++++++++++++- arch/x86/kvm/vmx.c | 38 ++++++++++++++++++++++++++++++++- arch/x86/kvm/x86.c | 5 +++++ 5 files changed, 97 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 5772fba1c64e..41c7b06588f9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -534,7 +534,16 @@ struct kvm_vcpu_hv { cpumask_t tlb_flush; }; +#ifdef CONFIG_KVM_PROCLOCAL +struct kvm_vcpu_arch_hidden { + u64 placeholder; +}; +#endif + struct kvm_vcpu_arch { +#ifdef CONFIG_KVM_PROCLOCAL + struct kvm_vcpu_arch_hidden *hidden; +#endif /* * rip and regs accesses must go through * kvm_{register,rip}_{read,write} functions. 
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 80abc68b3e90..a3640e2f1a32 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -97,6 +97,16 @@ config KVM_MMU_AUDIT This option adds a R/W kVM module parameter 'mmu_audit', which allows auditing of KVM MMU events at runtime. +config KVM_PROCLOCAL + bool "Use process-local allocation for KVM" + depends on KVM && PROCLOCAL + ---help--- + Use process-local memory for storing vCPU state in KVM. This + option removes assets from the kernel's global direct mapping + of physical memory and stores them only in the address space + of the process hosting a VM. + + # OK, it's a little counter-intuitive to do this, but it puts it neatly under # the virtualization menu. source drivers/vhost/Kconfig diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index cb202c238de2..af66b93902e5 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -41,6 +41,7 @@ #include #include #include +#include #include #include @@ -190,9 +191,20 @@ static u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly; */ static uint64_t osvw_len = 4, osvw_status; +#ifdef CONFIG_KVM_PROCLOCAL +struct vcpu_svm_hidden { + struct { /* mimic topology in vcpu_svm: */ + struct kvm_vcpu_arch_hidden arch; + } vcpu; +}; +#endif + struct vcpu_svm { struct kvm_vcpu vcpu; struct vmcb *vmcb; +#ifdef CONFIG_KVM_PROCLOCAL + struct vcpu_svm_hidden *hidden; +#endif unsigned long vmcb_pa; struct svm_cpu_data *svm_data; uint64_t asid_generation; @@ -2129,9 +2141,18 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) goto out; } +#ifdef CONFIG_KVM_PROCLOCAL + svm->hidden = kzalloc_proclocal(sizeof(struct vcpu_svm_hidden)); + if (!svm->hidden) { + err = -ENOMEM; + goto free_svm; + } + svm->vcpu.arch.hidden = &svm->hidden->vcpu.arch; +#endif + err = kvm_vcpu_init(&svm->vcpu, kvm, id); if (err) - goto free_svm; + goto free_hidden; err = -ENOMEM; page = alloc_page(GFP_KERNEL); @@ -2187,7 +2208,11 @@ static struct kvm_vcpu 
*svm_create_vcpu(struct kvm *kvm, unsigned int id) __free_page(page); uninit: kvm_vcpu_uninit(&svm->vcpu); +free_hidden: +#ifdef CONFIG_KVM_PROCLOCAL + kfree_proclocal(svm->hidden); free_svm: +#endif kmem_cache_free(kvm_vcpu_cache, svm); out: return ERR_PTR(err); @@ -2205,6 +2230,16 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); +#ifdef CONFIG_KVM_PROCLOCAL + /* + * note that the hidden vCPU state in a process-local allocation is + * already cleaned up, because a process's mm is torn down before files + * are closed. make any access in the cleanup code very visible. + */ + svm->hidden = (struct vcpu_svm_hidden *)POISON_POINTER_DELTA; + svm->vcpu.arch.hidden = (struct kvm_vcpu_arch_hidden *)POISON_POINTER_DELTA; +#endif + /* * The vmcb page can be recycled, causing a false negative in * svm_vcpu_load(). So, ensure that no logical CPU has this diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index f9a4faf2d1bc..6f59a6ad7835 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -37,6 +37,8 @@ #include #include #include +#include + #include "kvm_cache_regs.h" #include "x86.h" @@ -975,8 +977,19 @@ struct vmx_msrs { struct vmx_msr_entry val[NR_AUTOLOAD_MSRS]; }; +#ifdef CONFIG_KVM_PROCLOCAL +struct vcpu_vmx_hidden { + struct { /* mimic topology in vcpu_svm: */ + struct kvm_vcpu_arch_hidden arch; + } vcpu; +}; +#endif + struct vcpu_vmx { struct kvm_vcpu vcpu; +#ifdef CONFIG_KVM_PROCLOCAL + struct vcpu_vmx_hidden *hidden; +#endif unsigned long host_rsp; u8 fail; u8 msr_bitmap_mode; @@ -11756,6 +11769,16 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); +#ifdef CONFIG_KVM_PROCLOCAL + /* + * note that the hidden vCPU state in a process-local allocation is + * already cleaned up, because a process's mm is torn down before files + * are closed. make any access in the cleanup code very visible. 
+ */ + vmx->hidden = (struct vcpu_vmx_hidden *)POISON_POINTER_DELTA; + vmx->vcpu.arch.hidden = (struct kvm_vcpu_arch_hidden *)POISON_POINTER_DELTA; +#endif + if (enable_pml) vmx_destroy_pml_buffer(vmx); free_vpid(vmx->vpid); @@ -11777,11 +11800,20 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) if (!vmx) return ERR_PTR(-ENOMEM); +#ifdef CONFIG_KVM_PROCLOCAL + vmx->hidden = kzalloc_proclocal(sizeof(struct vcpu_vmx_hidden)); + if (!vmx->hidden) { + err = -ENOMEM; + goto free_vcpu; + } + vmx->vcpu.arch.hidden = &vmx->hidden->vcpu.arch; +#endif + vmx->vpid = allocate_vpid(); err = kvm_vcpu_init(&vmx->vcpu, kvm, id); if (err) - goto free_vcpu; + goto free_hidden; err = -ENOMEM; @@ -11868,7 +11900,11 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) vmx_destroy_pml_buffer(vmx); uninit_vcpu: kvm_vcpu_uninit(&vmx->vcpu); +free_hidden: +#ifdef CONFIG_KVM_PROCLOCAL + kfree_proclocal(vmx->hidden); free_vcpu: +#endif free_vpid(vmx->vpid); kmem_cache_free(kvm_vcpu_cache, vmx); return ERR_PTR(err); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 371d98422631..2cfb96ca8cc8 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9309,6 +9309,11 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) free_page((unsigned long)vcpu->arch.pio_data); if (!lapic_in_kernel(vcpu)) static_key_slow_dec(&kvm_no_apic_vcpu); + /* + * note that the hidden vCPU state in a process-local allocation is + * already cleaned up at this point, because a process's mm is torn down + * before files are closed. 
+ */ } void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) From patchwork Wed Jun 12 17:08:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marius Hillenbrand X-Patchwork-Id: 10990441 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DE54B1395 for ; Wed, 12 Jun 2019 17:11:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C48A028841 for ; Wed, 12 Jun 2019 17:11:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B8E3B28A17; Wed, 12 Jun 2019 17:11:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 32FF528841 for ; Wed, 12 Jun 2019 17:11:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 555416B026D; Wed, 12 Jun 2019 13:11:47 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 5062D6B026E; Wed, 12 Jun 2019 13:11:47 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3F4786B026F; Wed, 12 Jun 2019 13:11:47 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by kanga.kvack.org (Postfix) with ESMTP id 1FA366B026D for ; Wed, 12 Jun 2019 13:11:47 -0400 (EDT) Received: by mail-qk1-f198.google.com with SMTP id d62so3772914qke.21 for ; 
Wed, 12 Jun 2019 10:11:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:from:to:cc:subject:date :message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=BCwbtGWRrYM8rpdyPautZ9rurvQMH3lCItd7cCuySww=; b=b3t2riqvDkvuwcsL7tQHhiyxs73ZcSp9oPPGSKshKNIlwMp9d411UbGMy088cYjPq/ Q98fcUHSUs00cvVFSmXwV5yVDLg9JMaN3WFI/cNeLeuItyYidDqvETUnVeLs72SHjBxf bsrEUYR21533qtQ8CN02tZnZQRnfMJMHGyCyad+FqhIFfP0WppZpeOdj4WH6kIHBg5sp hZTm97sM9dL3Fcxm1yifWM6TkRuyT7j3F3zOpVwJ2po4IM3PjN03NBdO8co2qdeS/nI2 lC2201nrtqoEFm6ihRNUlP7gphcngHM/ADrdIMTZHMlmRpvkQ2BiU0ntJgf1+OUYM6iY IeyA== X-Gm-Message-State: APjAAAUEZ/YJAlO4hvYNW1RtPDOmMADZW26Okr6bPKsVqqCYYkGW1E9/ s3rQxz5efPAh3h21acxaR86t8LcQCm7pNRPpfFNT0WEKDO1AVT+H3PidrTcb10h7QfMFZwzrUpf CfKjXS4pxvYpiOp2EFViPRBtbnyawIS3/uKa6TzSauyU1NDHpQ7JWyaW7XKP//T7+qg== X-Received: by 2002:a05:620a:5ad:: with SMTP id q13mr13239917qkq.154.1560359506831; Wed, 12 Jun 2019 10:11:46 -0700 (PDT) X-Google-Smtp-Source: APXvYqx0vIDFB6jbIQpWEQeo78wXKshLUJKzXVi+td69ulq2HLWvCjBmP/YN5oJgoWe3xY3XIL67 X-Received: by 2002:a05:620a:5ad:: with SMTP id q13mr13239880qkq.154.1560359506198; Wed, 12 Jun 2019 10:11:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560359506; cv=none; d=google.com; s=arc-20160816; b=UgnYunk3R0lj5+SkJ6qSCoWjydLqd2VTpSEFUUIxtfvJhkamjioYDeJzOKQXK+enYG PgACjDJPqVRSadMV26KdIUeUN52uNrHPs7ZDQCjPyMuDyWA7/7hwt1EsBhT0TkSOR/Zm Av14SWWmmwOV9iNmRYvs+n/2UqR8xmN7GW7uvafRrxPxqeODeWTHgzLF7R3JmmM0x54h RDbyRbTlQjYKs+iDoS3FjTfk7tEoWInC/8W+kGbLebtC06ibQgqVafVixf9fzsKkCIFq 7R98D5AX1BKN/t3gFEKfrSnUaduSRqNVADt2c15SFNhFffLan83PUMUIj8ztBazycD/U ZnoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:dkim-signature; bh=BCwbtGWRrYM8rpdyPautZ9rurvQMH3lCItd7cCuySww=; b=0FQvWq4OI93NSrchkOVNBpjX3c6i9yA9XSVByOeHLt+Zwe+u0Wqq3tkvUFEwKUvq2R 
jHRNbkL25FJsw69UDeRytBg/YsNeMWm4/oYC6f72hS2FUXU05bGwatKBkalJhFHoA5Au jY7XVni7jNP9QZBx+shdqrjSJPPRDoHnxwOjb3IwcYjghSXW7+YQZ6QPYtUEkrvdUYCp S2KD8xxZQvFdR6rc9w+t2IHnGluz1SMumIczAG68pVZSCiw5IgaZYSIlgiAzve4hVjy0 J62fRpp8Zsoz2B843RNrZGhC/YWUBxE5P+Q6Zfenk6SeS0qqIVLPSzGi8RjQGChIovLg NROg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=ZpUE5n3n; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 72.21.198.25 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de Received: from smtp-fw-4101.amazon.com (smtp-fw-4101.amazon.com. [72.21.198.25]) by mx.google.com with ESMTPS id n13si314695qtn.125.2019.06.12.10.11.46 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 12 Jun 2019 10:11:46 -0700 (PDT) Received-SPF: pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 72.21.198.25 as permitted sender) client-ip=72.21.198.25; Authentication-Results: mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b=ZpUE5n3n; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 72.21.198.25 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1560359506; x=1591895506; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=BCwbtGWRrYM8rpdyPautZ9rurvQMH3lCItd7cCuySww=; b=ZpUE5n3n9rssRyVx4xgubcPZn8v32YgoRu+51EvPH4HMG/TYGqvq6Pr8 8jPSQz+j5cL8ty6MRVHWKMz8E424fqQv9jyZh95q18xhw7hkLlyjGeLaJ e1LuCmIga9v0FUFhcdmBNNWMegxATh6bEw8gbENE6zYQUHF6hwsyi7n8G 8=; X-IronPort-AV: E=Sophos;i="5.62,366,1554768000"; d="scan'208";a="770066910" Received: from iad6-co-svc-p1-lb1-vlan3.amazon.com (HELO 
email-inbound-relay-2c-87a10be6.us-west-2.amazon.com) ([10.124.125.6]) by smtp-border-fw-out-4101.iad4.amazon.com with ESMTP; 12 Jun 2019 17:11:44 +0000 Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (pdx2-ws-svc-lb17-vlan3.amazon.com [10.247.140.70]) by email-inbound-relay-2c-87a10be6.us-west-2.amazon.com (Postfix) with ESMTPS id 4A3B9A2424; Wed, 12 Jun 2019 17:11:43 +0000 (UTC) Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (ua08cfdeba6fe59dc80a8.ant.amazon.com [127.0.0.1]) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTP id x5CHBftS018553; Wed, 12 Jun 2019 19:11:41 +0200 Received: (from mhillenb@localhost) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Submit) id x5CHBexi018552; Wed, 12 Jun 2019 19:11:40 +0200 From: Marius Hillenbrand To: kvm@vger.kernel.org Cc: Marius Hillenbrand , linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, Alexander Graf , David Woodhouse , Julian Stecklina Subject: [RFC 07/10] kvm, vmx: move CR2 context switch out of assembly path Date: Wed, 12 Jun 2019 19:08:38 +0200 Message-Id: <20190612170834.14855-8-mhillenb@amazon.de> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de> References: <20190612170834.14855-1-mhillenb@amazon.de> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP From: Julian Stecklina The VM entry/exit path is a giant inline assembly statement. Simplify it by doing CR2 context switching in plain C. Move CR2 restore behind IBRS clearing, so we reduce the amount of code we execute with IBRS on. Using {read,write}_cr2() means KVM will use pv_mmu_ops instead of open coding native_{read,write}_cr2(). The CR2 code has been done in assembly since KVM's genesis[1], which predates the addition of the paravirt ops[2], i.e. 
KVM isn't deliberately avoiding the paravirt ops. [1] Commit 6aa8b732ca01 ("[PATCH] kvm: userspace interface") [2] Commit d3561b7fa0fb ("[PATCH] paravirt: header and stubs for paravirtualisation") Signed-off-by: Julian Stecklina [rebased; note that this patch mainly improves the readability of subsequent patches; we will drop it when rebasing to 5.x, since major refactoring of KVM makes this patch redundant.] Signed-off-by: Marius Hillenbrand Cc: Alexander Graf Cc: David Woodhouse --- arch/x86/kvm/vmx.c | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 6f59a6ad7835..16a383635b59 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11513,6 +11513,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) evmcs_rsp = static_branch_unlikely(&enable_evmcs) ? (unsigned long)&current_evmcs->host_rsp : 0; + if (read_cr2() != vcpu->arch.cr2) + write_cr2(vcpu->arch.cr2); + if (static_branch_unlikely(&vmx_l1d_should_flush)) vmx_l1d_flush(vcpu); @@ -11532,13 +11535,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) "2: \n\t" __ex("vmwrite %%" _ASM_SP ", %%" _ASM_DX) "\n\t" "1: \n\t" - /* Reload cr2 if changed */ - "mov %c[cr2](%0), %%" _ASM_AX " \n\t" - "mov %%cr2, %%" _ASM_DX " \n\t" - "cmp %%" _ASM_AX ", %%" _ASM_DX " \n\t" - "je 3f \n\t" - "mov %%" _ASM_AX", %%cr2 \n\t" - "3: \n\t" /* Check if vmlaunch of vmresume is needed */ "cmpl $0, %c[launched](%0) \n\t" /* Load guest registers. Don't clobber flags. 
*/ @@ -11599,8 +11595,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) "xor %%r14d, %%r14d \n\t" "xor %%r15d, %%r15d \n\t" #endif - "mov %%cr2, %%" _ASM_AX " \n\t" - "mov %%" _ASM_AX ", %c[cr2](%0) \n\t" "xor %%eax, %%eax \n\t" "xor %%ebx, %%ebx \n\t" @@ -11632,7 +11626,6 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) [r14]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R14])), [r15]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R15])), #endif - [cr2]"i"(offsetof(struct vcpu_vmx, vcpu.arch.cr2)), [wordsize]"i"(sizeof(ulong)) : "cc", "memory" #ifdef CONFIG_X86_64 @@ -11666,6 +11659,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu) /* Eliminate branch target predictions from guest mode */ vmexit_fill_RSB(); + vcpu->arch.cr2 = read_cr2(); + /* All fields are clean at this point */ if (static_branch_unlikely(&enable_evmcs)) current_evmcs->hv_clean_fields |= From patchwork Wed Jun 12 17:08:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marius Hillenbrand X-Patchwork-Id: 10990445 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B448B13AF for ; Wed, 12 Jun 2019 17:12:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9768328841 for ; Wed, 12 Jun 2019 17:12:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8B00C28A28; Wed, 12 Jun 2019 17:12:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by 
From: Marius Hillenbrand
To: kvm@vger.kernel.org
Cc: Marius Hillenbrand, linux-kernel@vger.kernel.org,
    kernel-hardening@lists.openwall.com, linux-mm@kvack.org,
    Alexander Graf, David Woodhouse, Julian Stecklina
Subject: [RFC 08/10] kvm, vmx: move register clearing out of assembly path
Date: Wed, 12 Jun 2019 19:08:40 +0200
Message-Id: <20190612170834.14855-9-mhillenb@amazon.de>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de>
References: <20190612170834.14855-1-mhillenb@amazon.de>
MIME-Version: 1.0

From: Julian Stecklina

Split the security related register clearing out of the large inline
assembly VM entry path. This results in two slightly less complicated
inline assembly statements, where it is clearer what each one does.

Signed-off-by: Julian Stecklina
[rebased to 4.20; note that the purpose of this patch is to make the
changes in the next commit more readable. we will drop this patch when
rebasing to 5.x, since major refactoring of KVM makes it redundant.]
Signed-off-by: Marius Hillenbrand
Cc: Alexander Graf
Cc: David Woodhouse
---
 arch/x86/kvm/vmx.c | 46 +++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 16a383635b59..0fe9a4ab8268 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11582,24 +11582,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		"mov %%r13, %c[r13](%0) \n\t"
 		"mov %%r14, %c[r14](%0) \n\t"
 		"mov %%r15, %c[r15](%0) \n\t"
-		/*
-		 * Clear host registers marked as clobbered to prevent
-		 * speculative use.
-		 */
-		"xor %%r8d, %%r8d \n\t"
-		"xor %%r9d, %%r9d \n\t"
-		"xor %%r10d, %%r10d \n\t"
-		"xor %%r11d, %%r11d \n\t"
-		"xor %%r12d, %%r12d \n\t"
-		"xor %%r13d, %%r13d \n\t"
-		"xor %%r14d, %%r14d \n\t"
-		"xor %%r15d, %%r15d \n\t"
 #endif
-
-		"xor %%eax, %%eax \n\t"
-		"xor %%ebx, %%ebx \n\t"
-		"xor %%esi, %%esi \n\t"
-		"xor %%edi, %%edi \n\t"
 		"pop %%" _ASM_BP "; pop %%" _ASM_DX " \n\t"
 		".pushsection .rodata \n\t"
 		".global vmx_return \n\t"
@@ -11636,6 +11619,35 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 #endif
 	      );
 
+	/*
+	 * Explicitly clear (in addition to marking them as clobbered) all GPRs
+	 * that have not been loaded with host state to prevent speculatively
+	 * using the guest's values.
+	 */
+	asm volatile (
+		"xor %%eax, %%eax \n\t"
+		"xor %%ebx, %%ebx \n\t"
+		"xor %%esi, %%esi \n\t"
+		"xor %%edi, %%edi \n\t"
+#ifdef CONFIG_X86_64
+		"xor %%r8d, %%r8d \n\t"
+		"xor %%r9d, %%r9d \n\t"
+		"xor %%r10d, %%r10d \n\t"
+		"xor %%r11d, %%r11d \n\t"
+		"xor %%r12d, %%r12d \n\t"
+		"xor %%r13d, %%r13d \n\t"
+		"xor %%r14d, %%r14d \n\t"
+		"xor %%r15d, %%r15d \n\t"
+#endif
+		::: "cc"
+#ifdef CONFIG_X86_64
+		 , "rax", "rbx", "rsi", "rdi"
+		 , "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"
+#else
+		 , "eax", "ebx", "esi", "edi"
+#endif
+		);
+
 	/*
 	 * We do not use IBRS in the kernel. If this vCPU has used the
 	 * SPEC_CTRL MSR it may have left it on; save the value and

From patchwork Wed Jun 12 17:08:42 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marius Hillenbrand
X-Patchwork-Id: 10990449
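
Editorial sketch for [RFC 08/10] (not part of the series). That patch moves
the register clearing into its own asm statement with no input or output
operands and an explicit clobber list. The user-space C fragment below
illustrates the same shape under stated assumptions: the names
demo_clear_scratch_gprs() and demo_sum_after_clear() are hypothetical, only
a subset of the registers from the patch is cleared, and on anything other
than x86-64 with GCC or Clang the asm is compiled out. It is meant to show
why the clobber list matters, not to reproduce the kernel code.

```c
/*
 * Illustration only: a stand-alone asm statement that zeroes scratch
 * registers and declares them clobbered. All demo_* names are hypothetical.
 */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define DEMO_HAVE_X86_64_ASM 1
#else
#define DEMO_HAVE_X86_64_ASM 0
#endif

/*
 * Zero a subset of GPRs. Writing the 32-bit half of a register
 * zero-extends into the full 64-bit register on x86-64. The clobber
 * list tells the compiler that these registers no longer hold whatever
 * it may have placed in them.
 */
static inline void demo_clear_scratch_gprs(void)
{
#if DEMO_HAVE_X86_64_ASM
	__asm__ __volatile__(
		"xor %%eax, %%eax\n\t"
		"xor %%esi, %%esi\n\t"
		"xor %%edi, %%edi\n\t"
		"xor %%r8d, %%r8d\n\t"
		"xor %%r9d, %%r9d\n\t"
		"xor %%r10d, %%r10d\n\t"
		"xor %%r11d, %%r11d\n\t"
		::: "cc", "rax", "rsi", "rdi", "r8", "r9", "r10", "r11");
#endif
}

/*
 * a and b are live across the clearing statement. Because every zeroed
 * register is listed as clobbered, the compiler has to keep both values
 * somewhere else (another register or the stack), so the sum survives.
 */
static inline long demo_sum_after_clear(long a, long b)
{
	demo_clear_scratch_gprs();
	return a + b;
}
```

Dropping a register from the clobber list while still zeroing it would let
the compiler assume its old contents survive, which is the kind of mistake
a separate, short asm statement makes easier to spot.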
From: Marius Hillenbrand
To: kvm@vger.kernel.org
Cc: Marius Hillenbrand, linux-kernel@vger.kernel.org,
    kernel-hardening@lists.openwall.com, linux-mm@kvack.org,
    Alexander Graf, David Woodhouse, Julian Stecklina
Subject: [RFC 09/10] kvm, vmx: move gprs to process local memory
Date: Wed, 12 Jun 2019 19:08:42 +0200
Message-Id: <20190612170834.14855-10-mhillenb@amazon.de>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de>
References: <20190612170834.14855-1-mhillenb@amazon.de>
MIME-Version: 1.0

General-purpose registers (GPRs) contain guest data and must be protected
from information leak vulnerabilities in the kernel.

Move GPRs into process local memory and change the VMX and SVM world
switch and related code accordingly.

The VMX and SVM world switch are giant inline assembly code blocks. To
keep the changes minimal, we pass all required state as a pointer to a
single struct in process-local memory, which is hidden from other
processes.

Note that this feature is strictly opt-in.
When disabled, the world switch code remains unchanged.

Signed-off-by: Marius Hillenbrand
Inspired-by: Julian Stecklina (while jsteckli@amazon.de)
Cc: Alexander Graf
Cc: David Woodhouse
---
 arch/x86/include/asm/kvm_host.h |  27 +++++++-
 arch/x86/kvm/kvm_cache_regs.h   |   4 +-
 arch/x86/kvm/svm.c              |  67 +++++++++++--------
 arch/x86/kvm/vmx.c              | 114 +++++++++++++++++++++-----------
 arch/x86/kvm/x86.c              |   2 +-
 5 files changed, 143 insertions(+), 71 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 41c7b06588f9..4896ecde1c11 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -534,21 +534,30 @@ struct kvm_vcpu_hv {
 	cpumask_t tlb_flush;
 };
 
+typedef unsigned long kvm_arch_regs_t[NR_VCPU_REGS];
 #ifdef CONFIG_KVM_PROCLOCAL
+/*
+ * access to vcpu guest state must go through kvm_vcpu_arch_state(vcpu).
+ */
 struct kvm_vcpu_arch_hidden {
-	u64 placeholder;
+	/*
+	 * rip and regs accesses must go through
+	 * kvm_{register,rip}_{read,write} functions.
+	 */
+	kvm_arch_regs_t regs;
 };
 #endif
 
 struct kvm_vcpu_arch {
 #ifdef CONFIG_KVM_PROCLOCAL
 	struct kvm_vcpu_arch_hidden *hidden;
-#endif
+#else
 	/*
 	 * rip and regs accesses must go through
 	 * kvm_{register,rip}_{read,write} functions.
 	 */
-	unsigned long regs[NR_VCPU_REGS];
+	kvm_arch_regs_t regs;
+#endif
 	u32 regs_avail;
 	u32 regs_dirty;
 
@@ -791,6 +800,18 @@ struct kvm_vcpu_arch {
 	bool l1tf_flush_l1d;
 };
 
+#ifdef CONFIG_KVM_PROCLOCAL
+static inline struct kvm_vcpu_arch_hidden *kvm_vcpu_arch_state(struct kvm_vcpu_arch *arch)
+{
+	return arch->hidden;
+}
+#else
+static inline struct kvm_vcpu_arch *kvm_vcpu_arch_state(struct kvm_vcpu_arch *vcpu_arch)
+{
+	return vcpu_arch;
+}
+#endif
+
 struct kvm_lpage_info {
 	int disallow_lpage;
 };
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 9619dcc2b325..b62c42d637e0 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -13,14 +13,14 @@ static inline unsigned long kvm_register_read(struct kvm_vcpu *vcpu,
 	if (!test_bit(reg, (unsigned long *)&vcpu->arch.regs_avail))
 		kvm_x86_ops->cache_reg(vcpu, reg);
 
-	return vcpu->arch.regs[reg];
+	return kvm_vcpu_arch_state(&vcpu->arch)->regs[reg];
 }
 
 static inline void kvm_register_write(struct kvm_vcpu *vcpu,
 				      enum kvm_reg reg,
 				      unsigned long val)
 {
-	vcpu->arch.regs[reg] = val;
+	kvm_vcpu_arch_state(&vcpu->arch)->regs[reg] = val;
 	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
 	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
 }
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index af66b93902e5..486ad451a67d 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -196,6 +196,7 @@ struct vcpu_svm_hidden {
 	struct { /* mimic topology in vcpu_svm: */
 		struct kvm_vcpu_arch_hidden arch;
 	} vcpu;
+	unsigned long vmcb_pa;
 };
 #endif
 
@@ -1585,7 +1586,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 	save->dr6 = 0xffff0ff0;
 	kvm_set_rflags(&svm->vcpu, 2);
 	save->rip = 0x0000fff0;
-	svm->vcpu.arch.regs[VCPU_REGS_RIP] = save->rip;
+	kvm_vcpu_arch_state(&svm->vcpu.arch)->regs[VCPU_REGS_RIP] = save->rip;
 
 	/*
 	 * svm_set_cr0() sets PG and WP and clears NW and CD on save->cr0.
@@ -3150,7 +3151,7 @@ static int nested_svm_exit_handled_msr(struct vcpu_svm *svm)
 	if (!(svm->nested.intercept & (1ULL << INTERCEPT_MSR_PROT)))
 		return NESTED_EXIT_HOST;
 
-	msr = svm->vcpu.arch.regs[VCPU_REGS_RCX];
+	msr = kvm_vcpu_arch_state(&svm->vcpu.arch)->regs[VCPU_REGS_RCX];
 	offset = svm_msrpm_offset(msr);
 	write = svm->vmcb->control.exit_info_1 & 1;
 	mask = 1 << ((2 * (msr & 0xf)) + write);
@@ -5656,10 +5657,11 @@ static void svm_cancel_injection(struct kvm_vcpu *vcpu)
 static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	unsigned long *regs = kvm_vcpu_arch_state(&vcpu->arch)->regs;
 
-	svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
-	svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
-	svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
+	svm->vmcb->save.rax = regs[VCPU_REGS_RAX];
+	svm->vmcb->save.rsp = regs[VCPU_REGS_RSP];
+	svm->vmcb->save.rip = regs[VCPU_REGS_RIP];
 
 	/*
 	 * A vmexit emulation is required before the vcpu can be executed
@@ -5690,6 +5692,10 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	svm->vmcb->save.cr2 = vcpu->arch.cr2;
 
+#ifdef CONFIG_KVM_PROCLOCAL
+	svm->hidden->vmcb_pa = svm->vmcb_pa;
+#endif
+
 	clgi();
 
 	/*
@@ -5765,24 +5771,31 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 		"xor %%edi, %%edi \n\t"
 		"pop %%" _ASM_BP
 		:
+#ifdef CONFIG_KVM_PROCLOCAL
+		: [svm]"a"(svm->hidden),
+#define SVM_STATE_STRUCT vcpu_svm_hidden
+#else
 		: [svm]"a"(svm),
-		  [vmcb]"i"(offsetof(struct vcpu_svm, vmcb_pa)),
-		  [rbx]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RBX])),
-		  [rcx]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RCX])),
-		  [rdx]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RDX])),
-		  [rsi]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RSI])),
-		  [rdi]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RDI])),
-		  [rbp]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RBP]))
+#define SVM_STATE_STRUCT vcpu_svm
+#endif
+		  [vmcb]"i"(offsetof(struct SVM_STATE_STRUCT, vmcb_pa)),
+		  [rbx]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RBX])),
+		  [rcx]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RCX])),
+		  [rdx]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RDX])),
+		  [rsi]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RSI])),
+		  [rdi]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RDI])),
+		  [rbp]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RBP]))
 #ifdef CONFIG_X86_64
-		, [r8]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R8])),
-		  [r9]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R9])),
-		  [r10]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R10])),
-		  [r11]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R11])),
-		  [r12]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R12])),
-		  [r13]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R13])),
-		  [r14]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R14])),
-		  [r15]"i"(offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_R15]))
+		, [r8]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R8])),
+		  [r9]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R9])),
+		  [r10]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R10])),
+		  [r11]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R11])),
+		  [r12]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R12])),
+		  [r13]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R13])),
+		  [r14]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R14])),
+		  [r15]"i"(offsetof(struct SVM_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R15]))
 #endif
+#undef SVM_STATE_STRUCT
 		: "cc", "memory"
 #ifdef CONFIG_X86_64
 		, "rbx", "rcx", "rdx", "rsi", "rdi"
@@ -5829,9 +5842,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 	x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
 
 	vcpu->arch.cr2 = svm->vmcb->save.cr2;
-	vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
-	vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
-	vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
+	regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
+	regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
+	regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
 
 	if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
 		kvm_before_interrupt(&svm->vcpu);
@@ -6265,14 +6278,16 @@ static int svm_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 	int ret;
 
 	if (is_guest_mode(vcpu)) {
+		unsigned long *regs = kvm_vcpu_arch_state(&vcpu->arch)->regs;
+
 		/* FED8h - SVM Guest */
 		put_smstate(u64, smstate, 0x7ed8, 1);
 		/* FEE0h - SVM Guest VMCB Physical Address */
 		put_smstate(u64, smstate, 0x7ee0, svm->nested.vmcb);
 
-		svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
-		svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
-		svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
+		svm->vmcb->save.rax = regs[VCPU_REGS_RAX];
+		svm->vmcb->save.rsp = regs[VCPU_REGS_RSP];
+		svm->vmcb->save.rip = regs[VCPU_REGS_RIP];
 
 		ret = nested_svm_vmexit(svm);
 		if (ret)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0fe9a4ab8268..0d2d6b7b0d50 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -982,6 +982,10 @@ struct vcpu_vmx_hidden {
 	struct { /* mimic topology in vcpu_svm: */
 		struct kvm_vcpu_arch_hidden arch;
 	} vcpu;
+	bool __launched; /* temporary, used in vmx_vcpu_run */
+	/* shadow fields, used in vmx_vcpu_run */
+	u8 fail;
+	unsigned long host_rsp;
 };
 #endif
 
@@ -1024,7 +1028,9 @@ struct vcpu_vmx {
 	struct loaded_vmcs vmcs01;
 	struct loaded_vmcs *loaded_vmcs;
 	struct loaded_vmcs *loaded_cpu_state;
+#ifndef CONFIG_KVM_PROCLOCAL
 	bool __launched; /* temporary, used in vmx_vcpu_run */
+#endif
 	struct msr_autoload {
 		struct vmx_msrs guest;
 		struct vmx_msrs host;
@@ -4612,10 +4618,10 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
 	switch (reg) {
 	case VCPU_REGS_RSP:
-		vcpu->arch.regs[VCPU_REGS_RSP] = vmcs_readl(GUEST_RSP);
+		kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RSP] = vmcs_readl(GUEST_RSP);
 		break;
 	case VCPU_REGS_RIP:
-		vcpu->arch.regs[VCPU_REGS_RIP] = vmcs_readl(GUEST_RIP);
+		kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RIP] = vmcs_readl(GUEST_RIP);
 		break;
 	case VCPU_EXREG_PDPTR:
 		if (enable_ept)
@@ -6965,7 +6971,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vmx->rmode.vm86_active = 0;
 	vmx->spec_ctrl = 0;
 
-	vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
+	kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RDX] = get_rdx_init_val();
 	kvm_set_cr8(vcpu, 0);
 
 	if (!init_event) {
@@ -7712,7 +7718,7 @@ static int handle_cpuid(struct kvm_vcpu *vcpu)
 
 static int handle_rdmsr(struct kvm_vcpu *vcpu)
 {
-	u32 ecx = vcpu->arch.regs[VCPU_REGS_RCX];
+	u32 ecx = kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RCX];
 	struct msr_data msr_info;
 
 	msr_info.index = ecx;
@@ -7726,17 +7732,17 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu)
 	trace_kvm_msr_read(ecx, msr_info.data);
 
 	/* FIXME: handling of bits 32:63 of rax, rdx */
-	vcpu->arch.regs[VCPU_REGS_RAX] = msr_info.data & -1u;
-	vcpu->arch.regs[VCPU_REGS_RDX] = (msr_info.data >> 32) & -1u;
+	kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RAX] = msr_info.data & -1u;
+	kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RDX] = (msr_info.data >> 32) & -1u;
 	return kvm_skip_emulated_instruction(vcpu);
 }
 
 static int handle_wrmsr(struct kvm_vcpu *vcpu)
 {
 	struct msr_data msr;
-	u32 ecx = vcpu->arch.regs[VCPU_REGS_RCX];
-	u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u)
-		| ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32);
+	u32 ecx = kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RCX];
+	u64 data = (kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RAX] & -1u)
+		| ((u64)(kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RDX] & -1u) << 32);
 
 	msr.data = data;
 	msr.index = ecx;
@@ -10036,7 +10042,7 @@ static bool valid_ept_address(struct kvm_vcpu *vcpu, u64 address)
 static int nested_vmx_eptp_switching(struct kvm_vcpu *vcpu,
 				     struct vmcs12 *vmcs12)
 {
-	u32 index = vcpu->arch.regs[VCPU_REGS_RCX];
+	u32 index = kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RCX];
 	u64 address;
 	bool accessed_dirty;
 	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
@@ -10082,7 +10088,7 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmcs12 *vmcs12;
-	u32 function = vcpu->arch.regs[VCPU_REGS_RAX];
+	u32 function = kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RAX];
 
 	/*
 	 * VMFUNC is only supported for nested guests, but we always enable the
@@ -10241,7 +10247,7 @@ static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
 static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
 	struct vmcs12 *vmcs12, u32 exit_reason)
 {
-	u32 msr_index = vcpu->arch.regs[VCPU_REGS_RCX];
+	u32 msr_index = kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RCX];
 	gpa_t bitmap;
 
 	if (!nested_cpu_has(vmcs12, CPU_BASED_USE_MSR_BITMAPS))
@@ -11467,9 +11473,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	}
 
 	if (test_bit(VCPU_REGS_RSP, (unsigned long *)&vcpu->arch.regs_dirty))
-		vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);
+		vmcs_writel(GUEST_RSP, kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RSP]);
 	if (test_bit(VCPU_REGS_RIP, (unsigned long *)&vcpu->arch.regs_dirty))
-		vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);
+		vmcs_writel(GUEST_RIP, kvm_vcpu_arch_state(&vcpu->arch)->regs[VCPU_REGS_RIP]);
 
 	cr3 = __get_current_cr3_fast();
 	if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
@@ -11508,7 +11514,12 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
 
+#ifdef CONFIG_KVM_PROCLOCAL
+	vmx->hidden->__launched = vmx->loaded_vmcs->launched;
+	vmx->hidden->host_rsp = vmx->host_rsp;
+#else
 	vmx->__launched = vmx->loaded_vmcs->launched;
+#endif
 
 	evmcs_rsp = static_branch_unlikely(&enable_evmcs) ?
 		(unsigned long)&current_evmcs->host_rsp : 0;
@@ -11588,27 +11599,35 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		".global vmx_return \n\t"
 		"vmx_return: " _ASM_PTR " 2b \n\t"
 		".popsection"
-	      : : "c"(vmx), "d"((unsigned long)HOST_RSP), "S"(evmcs_rsp),
-		[launched]"i"(offsetof(struct vcpu_vmx, __launched)),
-		[fail]"i"(offsetof(struct vcpu_vmx, fail)),
-		[host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp)),
-		[rax]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RAX])),
-		[rbx]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RBX])),
-		[rcx]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RCX])),
-		[rdx]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RDX])),
-		[rsi]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RSI])),
-		[rdi]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RDI])),
-		[rbp]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_RBP])),
+#ifdef CONFIG_KVM_PROCLOCAL
+	      : : "c"(vmx->hidden),
+#define VMX_STATE_STRUCT vcpu_vmx_hidden
+#else
+	      : : "c"(vmx),
+#define VMX_STATE_STRUCT vcpu_vmx
+#endif
+		"d"((unsigned long)HOST_RSP), "S"(evmcs_rsp),
+		[launched]"i"(offsetof(struct VMX_STATE_STRUCT, __launched)),
+		[fail]"i"(offsetof(struct VMX_STATE_STRUCT, fail)),
+		[host_rsp]"i"(offsetof(struct VMX_STATE_STRUCT, host_rsp)),
+		[rax]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RAX])),
+		[rbx]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RBX])),
+		[rcx]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RCX])),
+		[rdx]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RDX])),
+		[rsi]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RSI])),
+		[rdi]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RDI])),
+		[rbp]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_RBP])),
 #ifdef CONFIG_X86_64
-		[r8]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R8])),
-		[r9]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R9])),
-		[r10]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R10])),
-		[r11]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R11])),
-		[r12]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R12])),
-		[r13]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R13])),
-		[r14]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R14])),
-		[r15]"i"(offsetof(struct vcpu_vmx, vcpu.arch.regs[VCPU_REGS_R15])),
+		[r8]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R8])),
+		[r9]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R9])),
+		[r10]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R10])),
+		[r11]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R11])),
+		[r12]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R12])),
+		[r13]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R13])),
+		[r14]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R14])),
+		[r15]"i"(offsetof(struct VMX_STATE_STRUCT, vcpu.arch.regs[VCPU_REGS_R15])),
 #endif
+#undef VMX_STATE_STRUCT
 		[wordsize]"i"(sizeof(ulong))
 	      : "cc", "memory"
 #ifdef CONFIG_X86_64
@@ -11671,6 +11690,11 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	/* Eliminate branch target predictions from guest mode */
 	vmexit_fill_RSB();
 
+#ifdef CONFIG_KVM_PROCLOCAL
+	vmx->fail = vmx->hidden->fail;
+	vmx->host_rsp = vmx->hidden->host_rsp;
+#endif
+
 	vcpu->arch.cr2 = read_cr2();
 
 	/* All fields are clean at this point */
@@ -11794,7 +11818,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
 	free_loaded_vmcs(vmx->loaded_vmcs);
 	kfree(vmx->guest_msrs);
 	kvm_vcpu_uninit(vcpu);
-	kmem_cache_free(kvm_vcpu_cache, vmx);
+	kmem_cache_free(kvm_vcpu_cache, vmx);
 }
 
 static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
@@ -13570,7 +13594,12 @@ static int __noclone nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 		vmx->loaded_vmcs->host_state.cr4 = cr4;
 	}
 
+#ifdef CONFIG_KVM_PROCLOCAL
+	vmx->hidden->__launched = vmx->loaded_vmcs->launched;
+	vmx->hidden->host_rsp = vmx->host_rsp;
+#else
 	vmx->__launched = vmx->loaded_vmcs->launched;
+#endif
 
 	asm(
 		/* Set HOST_RSP */
@@ -13593,12 +13622,19 @@ static int __noclone nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 		".global vmx_early_consistency_check_return\n\t"
 		"vmx_early_consistency_check_return: " _ASM_PTR " 2b\n\t"
 		".popsection"
-	      :
-	      : "c"(vmx), "d"((unsigned long)HOST_RSP),
-		[launched]"i"(offsetof(struct vcpu_vmx, __launched)),
-		[fail]"i"(offsetof(struct vcpu_vmx, fail)),
-		[host_rsp]"i"(offsetof(struct vcpu_vmx, host_rsp))
+#ifdef CONFIG_KVM_PROCLOCAL
+	      : : "c"(vmx->hidden),
+#define VMX_STATE_STRUCT vcpu_vmx_hidden
+#else
+	      : : "c"(vmx),
+#define VMX_STATE_STRUCT vcpu_vmx
+#endif
+		"d"((unsigned long)HOST_RSP),
+		[launched]"i"(offsetof(struct VMX_STATE_STRUCT, __launched)),
+		[fail]"i"(offsetof(struct VMX_STATE_STRUCT, fail)),
+		[host_rsp]"i"(offsetof(struct VMX_STATE_STRUCT, host_rsp))
 	      : "rax", "cc", "memory"
+#undef VMX_STATE_STRUCT
 	);
 
 	vmcs_writel(HOST_RIP, vmx_return);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2cfb96ca8cc8..35e41a772807 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9048,7 +9048,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		vcpu->arch.xcr0 = XFEATURE_MASK_FP;
 	}
 
-	memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
+	memset(kvm_vcpu_arch_state(&vcpu->arch)->regs, 0, sizeof(kvm_arch_regs_t));
 	vcpu->arch.regs_avail = ~0;
 	vcpu->arch.regs_dirty = ~0;
 

From patchwork Wed Jun 12 17:08:44 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marius Hillenbrand
X-Patchwork-Id: 10990453
mail.wl.linuxfoundation.org (Postfix) with ESMTP id 714CC28841 for ; Wed, 12 Jun 2019 17:12:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6564128A17; Wed, 12 Jun 2019 17:12:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A97E428841 for ; Wed, 12 Jun 2019 17:12:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C0B336B026A; Wed, 12 Jun 2019 13:12:33 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id BBB7F6B026B; Wed, 12 Jun 2019 13:12:33 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AAA4C6B026F; Wed, 12 Jun 2019 13:12:33 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-vs1-f72.google.com (mail-vs1-f72.google.com [209.85.217.72]) by kanga.kvack.org (Postfix) with ESMTP id 858416B026A for ; Wed, 12 Jun 2019 13:12:33 -0400 (EDT) Received: by mail-vs1-f72.google.com with SMTP id x22so1883434vsj.1 for ; Wed, 12 Jun 2019 10:12:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:dkim-signature:from:to:cc:subject:date :message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=1l13rRWuFlC2jtnumLOMqJEjamUkvWPQRDS+qbpg+64=; b=aJpG2l8vA6xFqLcagyDx4OlButQs/iy4pHnbB0zKrMsT/tbdZRX90c1sfRObB7R39f ZHYWHVNyX4YbJ4JoqggYW7NDOEp7oUP0oRity7kcxXn7G44knhtdROpvE0L5XpNuV82Z t0a9N2WI9dDI/nJ3xNWTX7vN18yz6y627R62u4DTQ4el/eKmmDrowrg+FlRix+iPS8pU 
cSYOTlUbnAQRHi6MCcPAQv1LnacD1WL7IqW3CNaSxf9Z50hPzVBMZr007H/+GP5UEOdw Hv2ML5xz5Ml/bySoXAC+TKNgOJzPfyrJU+Sp6+WhbAPUIYbWS1PcZTxXNN7LBCZqqi/P IhXQ== X-Gm-Message-State: APjAAAULJCWucW+bbhAu5SadPE0CXOqm2r+dQ9++vpBqkyfp6H2tDy1/ C5vHw3xbj6kdru9+JV2FY5D8Vw6/XUQsheulPfrsgIee7xgpHQe2E8ByPUEnvDbgoL4O7hvk/Nj K9hXXwfors0npr1w/yKVxrDmDasdSqPD8J1hM76xDC52vbsh6QLM1GfYWrQ4hLn+4+w== X-Received: by 2002:a67:ebcb:: with SMTP id y11mr5574198vso.138.1560359553217; Wed, 12 Jun 2019 10:12:33 -0700 (PDT) X-Google-Smtp-Source: APXvYqzWYwTMwez1sfue5lxed/psTe5JgQg6m8pzs/uQqMNsLshAJn8bZM1gLBhQamnOb/zVLiaC X-Received: by 2002:a67:ebcb:: with SMTP id y11mr5574116vso.138.1560359552468; Wed, 12 Jun 2019 10:12:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560359552; cv=none; d=google.com; s=arc-20160816; b=xpRfBQQZdbtZgDb/SGd4CKV3Ynu8T7bs/RIEd3/QIVLgGYcQ2dU6YsTv7dncw/YqeY S247vSepSz0fC6A26m3kvafzGqgQEwNzJtOglA3QxVG7yiChVrpL1sHsZwKLyzMacg3V 8rujAwrjHH/iMfG1uXtPKm+lO+m6TCqved1UePVGh7HtXVqq/LRwVriENl4wnebiTU1e O47+bxbtO3IeGjDzHg+kElJqgQ9IKP3JA68E5inehtk6vYZRpMyOjvBbCotMV69zrbVv sAWRxNcTa5pt3WqdAL3gNdeBEfc4qaS30UtMEXwdlrTKraCsTpFoINI0mtK5lpC+dWiL kxeA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:dkim-signature; bh=1l13rRWuFlC2jtnumLOMqJEjamUkvWPQRDS+qbpg+64=; b=rF3eIyOfv75uFLHVtz+YwnvSdOfaILKAmKqIH/nTq4g7zsbfIG+31I2+jLY2dNJCEt yZM7M6GPTJ9MgUn4Qy78MoLAiWrE04pOjnXowpp9KWV01SweL68nuK0doDASs7yYPIZX CY/in6YQcXCniWGsl5Gbm4qoY0auDqMUt4vPf8biio3fGrh3HVQQDExOZrtE6iHlll1P UBvSGO2LSH18oOEydAVMoPugy0oCyxHVlDnyPvF326mSsAPTHxpc1AMsqjQ5nYUV68T1 PMPprOlOGyjwtCYJoIr+dEytibrpNSFZxZloD0CbVq3yaUMucn6w+YIc1U+LLPuLmNJ8 T5ww== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b="uk/18V8h"; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 207.171.190.10 as permitted sender) 
smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de Received: from smtp-fw-33001.amazon.com (smtp-fw-33001.amazon.com. [207.171.190.10]) by mx.google.com with ESMTPS id b65si84188vsb.403.2019.06.12.10.12.32 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 12 Jun 2019 10:12:32 -0700 (PDT) Received-SPF: pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 207.171.190.10 as permitted sender) client-ip=207.171.190.10; Authentication-Results: mx.google.com; dkim=pass header.i=@amazon.de header.s=amazon201209 header.b="uk/18V8h"; spf=pass (google.com: domain of prvs=059bff19d=mhillenb@amazon.com designates 207.171.190.10 as permitted sender) smtp.mailfrom="prvs=059bff19d=mhillenb@amazon.com"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amazon.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1560359552; x=1591895552; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1l13rRWuFlC2jtnumLOMqJEjamUkvWPQRDS+qbpg+64=; b=uk/18V8hWiCKkVWuXL+SyHhzilKdjRSv0RC4LZ0o9eqeou4CKbEIUUDh 863ATNCWmA85T2Dtp3itccqngvWdOIPngEAd/sfRiKyL5hePWmBnTW7zO NEctAVSYBT+6jMV6GwqNZzjcaMekLnylqHljtwllmeStw3Wz9N98yrSUt w=; X-IronPort-AV: E=Sophos;i="5.62,366,1554768000"; d="scan'208";a="805048939" Received: from sea3-co-svc-lb6-vlan2.sea.amazon.com (HELO email-inbound-relay-1a-807d4a99.us-east-1.amazon.com) ([10.47.22.34]) by smtp-border-fw-out-33001.sea14.amazon.com with ESMTP; 12 Jun 2019 17:12:30 +0000 Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (iad7-ws-svc-lb50-vlan2.amazon.com [10.0.93.210]) by email-inbound-relay-1a-807d4a99.us-east-1.amazon.com (Postfix) with ESMTPS id 44B25A05E6; Wed, 12 Jun 2019 17:12:28 +0000 (UTC) Received: from ua08cfdeba6fe59dc80a8.ant.amazon.com (ua08cfdeba6fe59dc80a8.ant.amazon.com [127.0.0.1]) by 
ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTP id x5CHCPHK018985; Wed, 12 Jun 2019 19:12:25 +0200 Received: (from mhillenb@localhost) by ua08cfdeba6fe59dc80a8.ant.amazon.com (8.15.2/8.15.2/Submit) id x5CHCPu3018963; Wed, 12 Jun 2019 19:12:25 +0200 From: Marius Hillenbrand To: kvm@vger.kernel.org Cc: Marius Hillenbrand , linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, linux-mm@kvack.org, Alexander Graf , David Woodhouse , Julian Stecklina Subject: [RFC 10/10] kvm, x86: move guest FPU state into process local memory Date: Wed, 12 Jun 2019 19:08:44 +0200 Message-Id: <20190612170834.14855-11-mhillenb@amazon.de> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190612170834.14855-1-mhillenb@amazon.de> References: <20190612170834.14855-1-mhillenb@amazon.de> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP FPU registers contain guest data and must be protected from information leak vulnerabilities in the kernel. FPU register state for vCPUs are allocated from the globally-visible kernel heap. Change this to use process-local memory instead and thus prevent access (or prefetching) in any other context in the kernel. 
Signed-off-by: Marius Hillenbrand
Inspired-by: Julian Stecklina (while jsteckli@amazon.de)
Cc: Alexander Graf
Cc: David Woodhouse
---
 arch/x86/include/asm/kvm_host.h |  8 ++++++++
 arch/x86/kvm/x86.c              | 24 ++++++++++++------------
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4896ecde1c11..b3574217b011 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -36,6 +36,7 @@
 #include
 #include
 #include
+#include

 #define KVM_MAX_VCPUS 288
 #define KVM_SOFT_MAX_VCPUS 240
@@ -545,6 +546,7 @@ struct kvm_vcpu_arch_hidden {
 	 * kvm_{register,rip}_{read,write} functions.
 	 */
 	kvm_arch_regs_t regs;
+	struct fpu guest_fpu;
 };
 #endif

@@ -631,9 +633,15 @@ struct kvm_vcpu_arch {
 	 * it is switched out separately at VMENTER and VMEXIT time. The
 	 * "guest_fpu" state here contains the guest FPU context, with the
 	 * host PRKU bits.
+	 *
+	 * With process-local memory, the guest FPU state will be hidden in
+	 * kvm_vcpu_arch_hidden. Thus, access to this struct must go through
+	 * kvm_vcpu_arch_state(vcpu).
 	 */
 	struct fpu user_fpu;
+#ifndef CONFIG_KVM_PROCLOCAL
 	struct fpu guest_fpu;
+#endif

 	u64 xcr0;
 	u64 guest_supported_xcr0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 35e41a772807..480b4ed438ae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3792,7 +3792,7 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,

 static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 {
-	struct xregs_state *xsave = &vcpu->arch.guest_fpu.state.xsave;
+	struct xregs_state *xsave = &kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.xsave;
 	u64 xstate_bv = xsave->header.xfeatures;
 	u64 valid;

@@ -3834,7 +3834,7 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)

 static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
 {
-	struct xregs_state *xsave = &vcpu->arch.guest_fpu.state.xsave;
+	struct xregs_state *xsave = &kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.xsave;
 	u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
 	u64 valid;

@@ -3882,7 +3882,7 @@ static void kvm_vcpu_ioctl_x86_get_xsave(struct kvm_vcpu *vcpu,
 		fill_xsave((u8 *) guest_xsave->region, vcpu);
 	} else {
 		memcpy(guest_xsave->region,
-			&vcpu->arch.guest_fpu.state.fxsave,
+			&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.fxsave,
 			sizeof(struct fxregs_state));
 		*(u64 *)&guest_xsave->region[XSAVE_HDR_OFFSET / sizeof(u32)] =
 			XFEATURE_MASK_FPSSE;
@@ -3912,7 +3912,7 @@ static int kvm_vcpu_ioctl_x86_set_xsave(struct kvm_vcpu *vcpu,
 		if (xstate_bv & ~XFEATURE_MASK_FPSSE ||
 			mxcsr & ~mxcsr_feature_mask)
 			return -EINVAL;
-		memcpy(&vcpu->arch.guest_fpu.state.fxsave,
+		memcpy(&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.fxsave,
 			guest_xsave->region, sizeof(struct fxregs_state));
 	}
 	return 0;
@@ -8302,7 +8302,7 @@ static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 	preempt_disable();
 	copy_fpregs_to_fpstate(&vcpu->arch.user_fpu);
 	/* PKRU is separately restored in kvm_x86_ops->run. */
-	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu.state,
+	__copy_kernel_to_fpregs(&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state,
 				~XFEATURE_MASK_PKRU);
 	preempt_enable();
 	trace_kvm_fpu(1);
@@ -8312,7 +8312,7 @@ static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 {
 	preempt_disable();
-	copy_fpregs_to_fpstate(&vcpu->arch.guest_fpu);
+	copy_fpregs_to_fpstate(&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu);
 	copy_kernel_to_fpregs(&vcpu->arch.user_fpu.state);
 	preempt_enable();
 	++vcpu->stat.fpu_reload;
@@ -8807,7 +8807,7 @@ int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

 	vcpu_load(vcpu);

-	fxsave = &vcpu->arch.guest_fpu.state.fxsave;
+	fxsave = &kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.fxsave;
 	memcpy(fpu->fpr, fxsave->st_space, 128);
 	fpu->fcw = fxsave->cwd;
 	fpu->fsw = fxsave->swd;
@@ -8827,7 +8827,7 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)

 	vcpu_load(vcpu);

-	fxsave = &vcpu->arch.guest_fpu.state.fxsave;
+	fxsave = &kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.fxsave;

 	memcpy(fxsave->st_space, fpu->fpr, 128);
 	fxsave->cwd = fpu->fcw;
@@ -8883,9 +8883,9 @@ static int sync_regs(struct kvm_vcpu *vcpu)

 static void fx_init(struct kvm_vcpu *vcpu)
 {
-	fpstate_init(&vcpu->arch.guest_fpu.state);
+	fpstate_init(&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state);
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		vcpu->arch.guest_fpu.state.xsave.header.xcomp_bv =
+		kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.xsave.header.xcomp_bv =
 			host_xcr0 | XSTATE_COMPACTION_ENABLED;

 	/*
@@ -9009,11 +9009,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		 */
 		if (init_event)
 			kvm_put_guest_fpu(vcpu);
-		mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu.state.xsave,
+		mpx_state_buffer = get_xsave_addr(&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.xsave,
 					XFEATURE_MASK_BNDREGS);
 		if (mpx_state_buffer)
 			memset(mpx_state_buffer, 0, sizeof(struct mpx_bndreg_state));
-		mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu.state.xsave,
+		mpx_state_buffer = get_xsave_addr(&kvm_vcpu_arch_state(&vcpu->arch)->guest_fpu.state.xsave,
 					XFEATURE_MASK_BNDCSR);
 		if (mpx_state_buffer)
 			memset(mpx_state_buffer, 0, sizeof(struct mpx_bndcsr));